EGF-EGFR complex

ABSTRACT

The present invention relates to a crystal of a complex of an epidermal growth factor (EGF) and an epidermal growth factor receptor (EGFR), a crystal of a complex of EGFR and a substance regulating EGFR activity, structure coordinates of these crystals, a method for screening for the substance regulating EGFR activity, a method for designing the substance regulating EGFR activity, a method for designing an EGFR variant or an EGF variant, a method for producing an EGFR variant or an EGF variant and an EGF variant or an EGFR variant obtainable by such method, a method for designing an epitope using the structure coordinates of the EGF-EGFR complex, a method for producing an anti-EGFR antibody or an anti-EGF antibody and an antibody obtainable by such method, a polypeptide or a salt thereof comprising a region that forms an EGFR dimerization site, and the like.

CROSS REFERENCE TO RELATED APPLICATIONS

This application is a National Phase Application based on PCT/JP02/09332, filed Sep. 12, 2002, the content of which is incorporated herein by reference, and claims the priority of Japanese Patent Application No. 2002-28780, filed Feb. 5, 2002.

TECHNICAL FIELD

The present invention relates to a crystal of the complex of epidermal growth factor (hereinafter, it may be referred to as “EGF” or “EGF ligand”) and epidermal growth factor receptor (hereinafter, it may be referred to as “EGFR” or “EGF receptor”) and a crystal of the complex of EGFR and a substance regulating EGFR activity; a method for crystallizing the complex of EGFR and the substance regulating EGFR activity (particularly, EGF) that can be subjected to X-ray crystal structure analysis; structure coordinates of the complex of EGFR and the substance regulating EGFR activity (particularly, EGF) that are obtainable by structurally analyzing the crystal; a method for screening for and a method for designing the substance regulating EGFR activity using the structure coordinates; a method for identifying an EGF-EGFR binding site or an EGFR dimerization site in the EGF-EGFR complex; a method for designing a pharmacophore of the substance regulating EGFR activity; a method for screening for and a method for designing the substance regulating EGFR activity using the pharmacophore of the substance regulating EGFR activity and an EGFR antagonist that fits pharmacophores; a method for inhibiting EGFR activity using an EGFR antagonist; a method for designing and a method for producing an EGFR variant using the structure coordinates of the EGF-EGFR complex, an EGFR variant obtainable by the production method; a method for designing and a method for producing an EGF variant using the structure coordinates of the EGF-EGFR complex and the EGF variant obtainable by the production method; a method for obtaining structure coordinates of a protein with an unknown structure using the structure coordinates of the EGF-EGFR complex and the structure coordinates obtainable by such method; a method for designing an epitope using the structure coordinates of the EGF-EGFR complex; a method for producing an anti-EGFR antibody and an anti-EGFR antibody obtainable by such method; a method for producing an anti-EGF antibody and an anti-EGF antibody obtainable by such method; a polypeptide containing a region that forms an EGFR dimerization site or a salt thereof; and the like.

BACKGROUND ART

Epidermal growth factor receptor (EGFR) is a member of the receptor tyrosine kinase superfamily. EGFR is known to be involved in regulation of cell proliferation, maturation, and differentiation (Carpenter, G. & Cohen, S. J. Biol. Chem. 265, 7709-7712 (1990)).

Binding of epidermal growth factor (EGF) to the extracellular domain of EGFR is thought to induce receptor dimerization, which brings the cytoplasmic tyrosine kinase domains of the two receptors into close proximity, resulting in the activation of the intrinsic tyrosine kinase in the intracellular domain, followed by the activation of numerous downstream signal pathways (Schlessinger, J. Cell 103, 211-225 (2000)). Besides this activation mechanism, it has been recently demonstrated that a portion of EGFR activated by EGF is translocated to the nucleus, where it might function as a transcription factor to activate genes required for high-level proliferation activities (Lin, S. Y. et al., Nat. Cell Biol. 3, 802-808 (2001)). In the meantime, spontaneous oligomerization accompanied by tyrosine phosphorylation has been reported for oncogenic mutants lacking either a portion or most of the extracellular domain (Haley, J. D. et al., Oncogene 4, 273-283 (1989); Huang, H. S. et al., J. Biol. Chem. 272, 2927-2935 (1997)). The extracellular domain therefore also likely plays a critical role in suppressing ligand-independent spontaneous oligomerization.

Three homologues of EGFR (ErbB-2, ErbB-3, and ErbB-4) have been identified in humans. Numerous studies have demonstrated that, in addition to homo-dimerization, EGF ligands also induce combinatorial hetero-oligomerization of different pairs of the EGFR family members (Olayioye, M. A. et al., Embo J. 19, 3159-3167 (2000)).

The three-dimensional structure of a human EGF monomer has been analyzed using high-resolution NMR (J. Mol. Biol. (1992) 227, 271-282). As a result, 2 helical segments (Leu8-Tyr13 and Leu47-Glu51) have been discovered. The first segment has been reported to form a major β-sheet via disulfide bridges (Cys6-Cys20 and Cys14-Cys31), and the second segment has been reported to form a type II turn on the C-terminus of the protein. This helix has been reported to show amphipathic features with Leu47, Trp50, Trp49, and Leu52 on the hydrophobic surface, and Lys48 and Glu51 on the hydrophilic surface. This helix is thought to participate in the formation of a hydrophobic core (Val34, Arg45, and Trp50) in the periphery of Tyr37, which is a conserved residue. In recent years, analytical results achieved using NMR on EGF dimerization have been reported (J. Biol. Chem (2001) 276, 34913-34917). As a result, 3 disulfide bridges (Cys6-Cys20, Cys14-Cys31, and Cys33-Cys42) have been confirmed, and it has been revealed that EGF consists of an N-domain (residues 1-32) and a C-domain (residues 33-53). The N-domain has been reported to have an irregular N-terminal peptide segment (residues 1-12) and an anti-parallel β-sheet (residues 19-23 and residues 28-32). In addition, the C-domain has been reported to contain a short anti-parallel β-sheet (residues 36-38 and residues 44-46) and a C-terminal segment (residues 48-53).

Furthermore, research has been conducted on mutation of Arg41 and Leu47. As a result, it is known that these residues are essential for the binding of EGF with its receptor, and that substitution of arginine with lysine is not tolerated (Mol. Cell. Biol (1989) 9, 4083-4086; FEBS Letters (1990) 261, 392-396; FEBS Letters (1990) 271, 47-50; Biochemistry (1991) 30, 8891-8898; Proc. Nat. Acad. Sci., USA (1989) 86, 9836-9840). This indicates that the arginine side chain of EGF (guanidino group) participates in specific interaction with EGFR. In addition, point mutation studies have been conducted in which altered amino acids were prepared for Ile23, Ala25, Leu26, Ala30, and Asn32, and the results suggest that each of these residues may be an amino acid directly interacting with EGFR. However, although amino acid residues important for the interaction are increasingly being inferred from point mutation experiments, information needed for industrial application has not yet been obtained, because the active conformation of the amino acid side chains, the mode of interaction with EGFR, and the active conformation of the EGFR amino acid side chains are unknown.

Regarding EGFR, although efforts have been made to elucidate the crystal structure, such elucidation has not yet been achieved (J. Biol. Chem (1990), 265, 22082-22085; Acta Crystallogr D Biol Crystallogr (1998) 54, 999-1001), and the three-dimensional structure has been merely modeled by homology modeling (Biochim. Biophys. Acta (2001) 1550, 144-152). In this report, the model structure of EGFR has been built using an insulin receptor and a lymphocyte protein-tyrosine kinase as templates. Moreover, there is a report in which IGF-1R (insulin-like growth factor-1 receptor) has been used as a template (Jorissen R. N. et al., Protein Sci. 2000, 9(2), 310-324; WO 99/62955).

A point mutation experiment in which Glu367, Gly441, and Glu472 were each substituted with Lys has revealed that the Glu367Lys and Glu472Lys mutations do not affect the binding with ligands. On the other hand, the Gly441Lys mutation has been reported to significantly reduce the affinity for EGF (Biochemistry (2001) 40, 8930-8939).

Currently, the development of drug candidate compounds targeting EGFR is in progress (Drugs 2000; Vol. 60 Suppl. 1: 15-23, Clinical Cancer Research 2001; Vol. 7: 2958-2970). Furthermore, many low molecular weight inhibitors suppressing the kinase activity in the intracellular domain of EGFR have been reported (Drugs 2000; Vol. 60 Suppl. 1: 25-32). However, among low molecular weight compounds, neither medicaments that selectively promote activation by binding to the EGFR extracellular domain, nor medicaments that selectively inhibit activation have been marketed as drugs. In general, compared with antibodies and recombinant protein pharmaceutical preparations, low molecular weight agonists and antagonists can be orally administered in many cases, so that they are more useful as drugs. Therefore, lowering the molecular weight of a protein pharmaceutical preparation has been attempted for many diseases. However, the development of such a preparation generally necessitates further trial and error, so that the development of a drug useful for patients always requires a long time and a high cost. Furthermore, regarding an antagonist targeting an intracellular kinase domain, it is predicted that the backbone is limited because the antagonist needs to permeate a cell membrane, and that an excessive dose thereof is required. In addition, there are no experimental three-dimensional structures (crystal structures or NMR structures) of EGFR kinase domains, so that many inductions and syntheses on a trial-and-error basis are required for the provision of selectivity. Moreover, antagonists acting on an ATP binding region in many cases contain, within the molecule, a pharmacophore having ATP-like chemical properties. Thus, such an antagonist will often be a nucleic acid analogue in terms of the properties of a compound, and possible future side effects pose difficulties.

SUMMARY OF THE INVENTION

Under such circumstances, technology for mimicking interaction sites between EGF and EGFR has been eagerly desired in order to discover a low molecular weight compound that inhibits the interaction between EGF and EGFR or a low molecular weight compound that selectively binds with and activates an EGFR extracellular domain and transduces an EGF signal. Construction and design of EGFR agonists and EGFR antagonists or agonist antibodies and antagonist antibodies by molecular design techniques enable preparation of drugs selective against EGF/EGFR far more reasonably than conventional techniques. For such molecular design based on structure information or computer screening based on structure information, the experimental structure and the analytical results regarding the EGF-EGFR complex have a very important meaning.

However, as described above, regarding EGF ligands, although the structures of monomers or homodimers have been obtained based on NMR structures, the active conformation of the amino acid side chains or the main chain upon binding with EGFR is unknown. In the case of EGFR, the EGFR structure in an unliganded state has been merely predicted by modeling. Furthermore, the structure of the EGF-EGFR complex has been completely unknown. It has been impossible to infer the active conformation adopted when the two bind to each other and are activated from the receptor structure in an unliganded state and the structure of the ligand alone. Hence, inventions relating to the specification of pharmacophores obtained by analysis of the crystal structure of the EGF-EGFR complex and the interaction site thereof, and to techniques for molecular design, have been extremely desired.

As a result of intensive studies to solve the above problems and to provide a good EGFR agonist or a good EGFR antagonist, we have succeeded in establishing a technique for crystallizing EGF-EGFR complexes whose three-dimensional structure can be specified by X-ray crystal structure analysis. Using such a crystal, we have revealed the structure coordinates of the complex, which many researchers have attempted but none has achieved. Through further analysis thereof, we have completed the present invention by providing a method for screening for and a method for designing the substance regulating EGFR activity; a method for identifying an EGF-EGFR binding site or an EGFR dimerization site in the EGF-EGFR complex; a method for designing a pharmacophore of the substance regulating EGFR activity (particularly, EGF); a method for screening for and a method for designing the substance regulating EGFR activity using the pharmacophore of the substance regulating EGFR activity and an EGFR antagonist that fits pharmacophores; a method for inhibiting EGFR activity using an EGFR antagonist; a method for designing and a method for producing an EGFR variant using the structure coordinates of the EGF-EGFR complex, an EGFR variant obtainable by the production method; a method for designing and a method for producing an EGF variant using the structure coordinates of the EGF-EGFR complex and the EGF variant obtainable by the production method; a method for obtaining structure coordinates of a protein with an unknown structure using the structure coordinates of the EGF-EGFR complex and the structure coordinates obtainable by such method; a method for designing an epitope using the structure coordinates of the EGF-EGFR complex; a method for producing an anti-EGFR antibody and an anti-EGFR antibody obtainable by such method; a method for producing an anti-EGF antibody and an anti-EGF antibody obtainable by such method; a polypeptide containing a region that forms an EGFR dimerization site or a salt thereof; and the like.

The present invention is explained in detail as follows.

1. Crystal and Method for Producing Crystal

The present invention provides a crystal of the EGF-EGFR complex. The crystal is characterized in that EGF binds to EGFR at a 1:1 ratio, and the EGF-bound EGFRs (wherein each EGFR protein is in a state of being bound with each EGF protein) form a dimer. More specifically, the EGF-bound EGFR is dimerized by receptor-to-receptor binding not via EGF. The EGF-bound EGFR is, as shown in FIG. 1a, dimerized via domain II of EGFR, and more specifically, via each region consisting of the 240th to 267th amino acids of the amino acid sequence of human EGFR shown in SEQ ID NO: 1. Moreover, EGF interacts with domains I and III of EGFR. Domains I and III of each EGFR curve to the other side of a dimerization site while holding EGF between domains. Any crystals of EGF-EGFR complexes having such a feature are encompassed in the scope of the present invention. In the EGF-EGFR complex, the interaction site of EGF and EGFR is referred to as an “EGF-EGFR binding site” and a site where EGF-bound EGFR interacts with EGF-bound EGFR to be dimerized is referred to as a “dimerization site” or an “EGFR dimerization site.” Moreover, in the case of a ligand-receptor complex that is formed by a mechanism similar to that of EGF/EGFR, an interaction site of the ligand with the receptor is similarly referred to as a “ligand-receptor binding site” and a site where the ligand-bound receptors interact with each other to be dimerized is referred to as a “dimerization site” or a “receptor dimerization site.”

EGF and EGFR used in the present invention are derived from a mammal, preferably a mouse or a human, and particularly preferably a human. The amino acid sequences of EGF and EGFR are known. Human EGF (mature peptide) has been registered with the NCBI Protein DB under the accession number AAA72173 (SEQ ID NO: 2). Mouse EGF has been registered with the NCBI Protein DB under the accession number NP_034243. Rat EGF has been registered with the NCBI Protein DB under the accession number NP_036974. Human EGFR has been registered with the NCBI Protein DB under the accession number NP_005219 (SEQ ID NO: 15). Mouse EGFR has been registered with the NCBI Protein DB under the accession number CAA55587. Rat EGFR has been registered with the NCBI Protein DB under the accession number AAF14008. For EGF and EGFR derived from other species, sequence information can be obtained from known databases.

Furthermore, under conditions where the function is maintained, a protein consisting of an amino acid sequence derived from a native amino acid sequence by deletion, substitution and/or insertion of 1 or more, preferably 1 or several amino acids, or a protein to which an amino acid(s) is added to the N-terminus and/or the C-terminus is also encompassed. Generally, it is considered that such a slight difference in the primary structure will not largely affect the entire three-dimensional structure, and the function is maintained. The full-length amino acid sequence of human EGFR is shown in SEQ ID NO: 15. Since a region consisting of the 1st to the 24th amino acids of the amino acid sequence shown in SEQ ID NO: 15 is removed as a signal sequence, native mature human EGFR is composed of the 25th to the 1210th amino acids of the amino acid sequence shown in SEQ ID NO: 15.
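The relationship between full-length (precursor) numbering and mature numbering described above amounts to a fixed offset of 24 residues. The following is a minimal sketch of that conversion, assuming 1-based residue positions; the function and constant names are hypothetical and are introduced here for illustration only.

```python
# Values taken from the text: residues 1-24 of SEQ ID NO: 15 are the signal
# sequence, and the full-length sequence has 1210 residues.
SIGNAL_SEQUENCE_LENGTH = 24
PRECURSOR_LENGTH = 1210
MATURE_LENGTH = PRECURSOR_LENGTH - SIGNAL_SEQUENCE_LENGTH


def precursor_to_mature(position: int) -> int:
    """Convert a 1-based position in SEQ ID NO: 15 to mature EGFR numbering."""
    if not SIGNAL_SEQUENCE_LENGTH < position <= PRECURSOR_LENGTH:
        raise ValueError(f"position {position} is not part of the mature protein")
    return position - SIGNAL_SEQUENCE_LENGTH


def mature_to_precursor(position: int) -> int:
    """Convert a 1-based mature EGFR position to SEQ ID NO: 15 numbering."""
    if not 1 <= position <= MATURE_LENGTH:
        raise ValueError(f"position {position} is outside the mature protein")
    return position + SIGNAL_SEQUENCE_LENGTH


if __name__ == "__main__":
    print(precursor_to_mature(25), mature_to_precursor(619))
```

Under these assumptions, position 25 of the full-length sequence is residue 1 of the mature protein, and a position inside the signal sequence is rejected rather than mapped.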

As EGFR, the extracellular domain of EGFR is preferably used. EGFR has a structure that penetrates a cell membrane once, and is composed of, from the N-terminus, the extracellular domain, a transmembrane region, and an intracellular domain having a tyrosine kinase region and autophosphorylation sites. Furthermore, the extracellular domain is composed of 4 domains (from the N-terminus, referred to as domain I, domain II, domain III, and domain IV, or also referred to as L1, S1, L2, and S2 domains, respectively) (Bajaj, M. et. al. Biochim. Biophys. Acta 916, 220-226 (1987)). Domain I of human EGFR is composed of the 1st to the 165th amino acids of the amino acid sequence of mature human EGFR, domain II is composed of the 166th to the 312th amino acids of the same, domain III is composed of the 313th to the 512th amino acids of the same, and domain IV is composed of the 513th to the 619th amino acids of the same. It has been revealed according to the present invention that domains I, II, and III of EGFR participate in the formation of the EGF-EGFR complex. Hence, as EGFR, an extracellular domain including at least domain I, domain II, and domain III is preferably used.
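The domain boundaries given above can be expressed as a simple lookup table. The sketch below assumes mature human EGFR numbering and uses hypothetical names (`EXTRACELLULAR_DOMAINS`, `domain_of`); it is an illustration of the stated boundaries, not part of the disclosed method.

```python
# Domain boundaries (inclusive, mature human EGFR numbering) as stated in the text.
EXTRACELLULAR_DOMAINS = (
    ("I", 1, 165),
    ("II", 166, 312),
    ("III", 313, 512),
    ("IV", 513, 619),
)


def domain_of(residue: int) -> str:
    """Return the extracellular domain (I-IV) containing a mature residue number."""
    for name, first, last in EXTRACELLULAR_DOMAINS:
        if first <= residue <= last:
            return name
    raise ValueError(f"residue {residue} is outside the extracellular domain (1-619)")


if __name__ == "__main__":
    # The dimerization region described in the text spans residues 240 to 267.
    print({domain_of(r) for r in range(240, 268)})
```

With these boundaries, every residue of the 240th to 267th region falls in domain II, which agrees with the statement that dimerization occurs via domain II.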

1-1 Protein Purification

Collection sources of EGF to be used for crystal structure analysis in the present invention are not specifically limited, but the liver, the spleen, or the kidney of pigs, rats, bulls, and the like can be used. In addition, EGF can also be collected from an extracted human liver or the like. EGF is collected by homogenizing the above collection sources, and then purifying soluble components by several types of column chromatography. Furthermore, through the use of genetic engineering techniques, it is also possible to cause a gene encoding EGF to be expressed by bacteria, animal cells or the like, and thus to produce the EGF in large quantities. For example, EGF can be obtained by introducing a recombinant DNA having an EGF gene incorporated therein into a host such as Escherichia coli, yeast, a Chinese hamster ovary (CHO) cell, and the like in a manner that enables the expression of the EGF gene to obtain a transformant, and then culturing the transformant.

When a bacterium such as Escherichia coli is used as a host, the recombinant DNA containing an EGF gene is required to be able to replicate autonomously in the cell. Moreover, a transcription promoter, an SD sequence as a ribosomal RNA-binding region, or the like may be ligated. A transcription terminator or the like can also be inserted appropriately. A recombinant vector can be easily transferred into a host by, for example, a method using a commercially available kit.

EGF can be obtained by culturing the above transformant and then collecting the EGF from the culture. The “culture” means the culture supernatant as well as any of cultured cells or cultured microorganisms, or the crushed product of cells or microorganisms. The culture method is conducted according to a general method used in culturing hosts. For example, when CHO cells are used as a host, the cells are cultured in media that are generally used for animal cells under conditions of 5% CO₂ and 37° C. Upon culture, bovine serum and antibiotics may be added to the media if necessary.

After culture, when EGF is produced within microorganisms or within cells, target EGF is collected by disrupting the microorganisms or the cells by homogenizer treatment or the like. Subsequently, EGF is isolated and purified from the above culture using various types of chromatography or the like to prepare a sample.

EGFR can also be obtained according to the isolation and purification methods employed for EGF. In addition, commercially available recombinant EGF or EGFR can also be used.

To prepare the complex of EGF and EGFR, purified EGF is dissolved in a solution of purified EGFR. Dimers are prepared by binding the complexes of EGF and EGFR in a reaction system where EGF and EGFR are mixed.

An amino acid sequence of EGFR suitable for use in preparing the crystal of the present invention is shown in SEQ ID NO: 1, and an amino acid sequence of EGF suitable for use in the same is shown in SEQ ID NO: 2. A region consisting of the 1st to the 619th amino acids of the amino acid sequence of SEQ ID NO: 1 corresponds to a region consisting of the 1st to the 619th amino acids of native mature human EGFR (corresponding to the 25th to the 643rd amino acids of the amino acid sequence shown in SEQ ID NO: 15). A region consisting of the 620th to the 633rd amino acids of the amino acid sequence of SEQ ID NO: 1 contains a FLAG tag for purification. However, examples of the amino acid sequence are not limited to these amino acid sequences. An amino acid sequence that is derived from any one of the above amino acid sequences by mutation such as deletion, substitution, addition, or the like of 1 or more (e.g., 2 to 10) amino acids and has EGFR activity or EGF activity can also be used in the present invention. “EGFR activity” means activity capable of at least binding to EGF, and more preferably means activity of binding to EGF whereby dimerization is then induced. “EGF activity” means activity capable of at least binding to EGFR, and more preferably means activity of binding to EGFR and inducing EGFR dimerization by such binding. For example, an amino acid sequence of EGFR that can be used herein has an amino acid sequence composed of the region consisting of the 25th to the 643rd amino acids of the amino acid sequence shown in SEQ ID NO: 15 extended by the addition of 1 or more amino acids to the N-terminus and/or the C-terminus.

In addition, the above amino acid sequences can be mutated by a site-directed mutagenesis method known to persons skilled in the art, and a kit therefor can be used (e.g., Mutan-G and Mutan-K (both produced by TAKARA SHUZO CO., LTD.)).

The present invention also provides a method for producing crystallizable EGFR. The production method comprises:

-   (A) a step of producing an EGFR protein using Lec8 cells; and
-   (B) a step of deglycosylating the EGFR protein using glycosidase.

The method may further comprise (C) a step of roughly purifying the EGFR protein by salting out. We have revealed that 10 N-linked sugar chains bind to the extracellular domain of EGFR, and one of these sugar chains is difficult to remove by glycosidase treatment. Based on this finding, we have discovered that the use of Lec8 cells (ATCC CRL-1737), which are cells producing proteins having N-linked oligosaccharides deficient in terminal sialic acid and galactose residues, is efficient for producing crystallizable EGFR. Furthermore, we have discovered that rough purification using salting out is efficient when the EGFR protein produced using the Lec8 cells is purified. It is preferable to use ammonium sulfate precipitation for salting out. Specific examples of glycosidase include endoglycosidase D and endoglycosidase H. The present invention also provides crystallizable EGFR that is obtainable by the above production method. Such EGFR is characterized by possessing at least activity of binding to EGF. As long as EGFR possesses at least the activity of binding to EGF, examples of such EGFR include a protein consisting of an amino acid sequence derived from the native amino acid sequence by deletion, substitution, and/or insertion of 1 or more, and preferably 1 or several amino acids, or a protein comprising an amino acid sequence derived from the native amino acid sequence by addition of an amino acid(s) to the N-terminus and/or the C-terminus. Moreover, the crystallizable EGFR of the present invention preferably has quality that is sufficient to give a resolution of 10 Å or less, preferably 4.0 Å or less, and more preferably 3.5 Å or less when crystals obtained using the EGFR proteins (which may be crystals of EGFR alone, or crystals of the complex of EGFR and a substance regulating EGFR activity) are subjected to X-ray crystal structure analysis.

EGFR obtainable by the above method can be used in the method of the present invention for crystallizing an EGF-EGFR complex or the complex of EGFR and a substance regulating EGFR activity. By the use of such EGFR, crystals can be easily obtained.

1-2 Preparation of Protein Crystals

Subsequently, a crystal of the EGF-EGFR complex is prepared. As a method for crystallizing proteins, general protein crystallization techniques such as a vapor diffusion method, a batch method, a dialysis method, or the like can be used. Determination of physical and chemical factors such as protein concentration, salt concentration, pH, types of a precipitating agent, and temperature is important in protein crystallization, and the determination of these factors is generally known to persons skilled in the art. To efficiently examine a plurality of parameters (e.g., precipitating agent, pH, and salt concentration), it is preferred to produce phase diagrams. Once crystals have been precipitated, the parameters are further varied in fine steps so as to refine the conditions under which the best crystals appear.

(1) The “vapor diffusion method” involves placing a droplet of a protein solution containing a precipitating agent in a container that contains a buffer (external solution) containing a precipitating agent at a higher concentration, sealing the container, and then allowing it to stand. Depending on how the droplet is placed, the hanging drop method and the sitting drop method may be used. In the present invention, either of these methods can be employed. The hanging drop method involves spotting a small droplet of a protein solution onto a cover glass and inverting the cover glass over a reservoir so as to seal the reservoir. The sitting drop method involves placing an appropriate droplet stand within a reservoir, spotting a small drop of a protein solution on the droplet stand, and then sealing the reservoir using a cover glass or the like. In this case, it is preferred that the solution in the reservoir contains a precipitating agent, and that the small drop of protein solution contains the precipitating agent in a small quantity.

Any appropriate solutions of precipitating agents to be used in the vapor diffusion method can be prepared by persons skilled in the art. For example, a precipitating agent solution may be prepared so as to contain the following components (a) to (c).

-   (a) Precipitating agent: polyethylene glycol (PEG) with a molecular weight between 400 and 2000 at a concentration between 0% and 50% by weight, or ammonium sulfate, MPD (methylpentanediol), or the like
-   (b) Addition salt: NaCl, lithium chloride, magnesium chloride, or the like at a concentration between 0.1 M and 0.3 M
-   (c) Buffer agent: sodium phosphate, potassium phosphate, tris-HCl, or the like

In addition, components used herein are not limited to the above components, and components generally used by persons skilled in the art can be used appropriately.
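When several such parameters are examined together, the candidate conditions can be enumerated systematically before any solutions are prepared. The following is a minimal sketch of a grid of PEG concentration, salt concentration, and pH. The concentration ranges follow the components listed above, but the specific sampling points, the pH values, and all names (`PEG_PERCENT`, `SALT_MOLAR`, `BUFFER_PH`, `build_screen`) are assumptions made purely for illustration.

```python
from itertools import product

# Hypothetical sampling points. Only the ranges (PEG 0-50% by weight, salt
# 0.1-0.3 M) come from the text; the pH values are assumed for illustration.
PEG_PERCENT = (5, 10, 15, 20, 25)
SALT_MOLAR = (0.1, 0.2, 0.3)
BUFFER_PH = (7.0, 8.0, 9.0)


def build_screen(peg_levels=PEG_PERCENT, salt_levels=SALT_MOLAR, ph_levels=BUFFER_PH):
    """Return every combination of the three parameters as a list of dicts."""
    return [
        {"peg_percent": peg, "salt_molar": salt, "ph": ph}
        for peg, salt, ph in product(peg_levels, salt_levels, ph_levels)
    ]


if __name__ == "__main__":
    screen = build_screen()
    print(len(screen), "candidate conditions; first:", screen[0])
```

A full factorial grid is only one possible design. Once initial crystals appear, the same function could be called with narrower, more finely spaced levels around the successful condition.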

(2) The “batch method” involves adding the solution of a precipitating agent dropwise to a protein solution so as to make the solution slightly turbid, removing insoluble matter by centrifugation, putting the supernatant in a small test tube, sealing the tube, and then allowing the resultant to stand.

(3) The “dialysis method” involves dialyzing a protein solution against a buffer (external solution) containing a precipitating agent using a semipermeable membrane.

In protein crystallization, it is essential to conduct crystallization under conditions where complexes or dimers are maintained. In the present invention, the EGF/EGFR crystals can be precipitated by adding a precipitating agent to a solution (protein solution) containing EGF and EGFR dissolved therein. Examples of solvents for the above protein solution are water, a buffer, and the like. As a buffer, for example, 0.2 M Tris-HCl (pH 8.0), approximately 0.1 M NaCl, or the like can be used. “By adding” also means by bringing a protein solution into contact with a buffering agent solution. For example, in the case of the EGF-EGFR complex, crystals appropriate for X-ray crystal analysis can be obtained by the vapor diffusion method under conditions of pH 7.0-9.0, a protein concentration between 3 and 15 mg/ml, and a temperature of 20° C. using polyethylene glycol as a precipitating agent. However, examples of conditions are not limited to the above conditions. The method for producing crystals of the present invention preferably comprises a step of sugar chain treatment in a purification process for EGFR.

Furthermore, the present invention provides a method for producing a crystal of a complex of EGFR and a substance regulating EGFR activity. The production method comprises the following steps of: (A) producing crystallizable EGFR; and (B) bringing EGFR into contact with a substance regulating EGFR activity. A typical example of the substance regulating EGFR activity is EGF, but the substance is not limited thereto. The substance regulating EGFR activity may be any of a peptide, an oligonucleotide, a synthetic compound, or a compound derived from nature, as long as it is an EGFR agonist or an EGFR antagonist (described later). The substance regulating EGFR activity can be prepared by a technique known in the art. The crystal production method can further comprise (C) a step of crystallizing a complex of EGFR and the substance regulating EGFR activity. The complex can be crystallized by employing a general protein crystallization technique such as the above vapor diffusion method, the batch method, or the dialysis method.

The step (C) can further comprise: (C-1) a step of bringing a solution containing EGFR and the substance regulating EGFR activity into contact with a solution containing a precipitating agent; (C-2) a step of generating a crystal; and (C-3) a step of isolating the crystal. The step of bringing a solution containing EGFR and the substance regulating EGFR activity into contact with a solution containing a precipitating agent can be conducted by contact via a dialysis membrane, contact via a sealed space in the hanging drop method, mixing of solutions, or the like. The step of generating a crystal can be realized by creating a supersaturation state of a protein due to slowly increased concentrations of the protein or the precipitating agent in the solution containing EGFR and the substance regulating EGFR activity. Furthermore, the step can also be realized by introducing a seed crystal into a protein solution.

1-3 X-Ray Crystal Diffraction

The X-ray crystal structure analysis technique is most commonly used as a technique for elucidating the three-dimensional structure of a protein. Specifically, X-ray structure analysis involves crystallizing a protein, exposing the crystal to monochromatized X-rays, and then, based on the diffraction images thus obtained, elucidating the three-dimensional structure of the protein. The present invention provides a method for determining structure coordinates of a complex of EGFR and a substance regulating EGFR activity. The method comprises the following steps of:

-   (A) producing a crystal of a complex of EGFR and a substance regulating EGFR activity; and
-   (B) obtaining structure coordinates of the complex of EGFR and the substance regulating EGFR activity by X-ray crystal structure analysis using the crystal obtained by (A).

The step (A) is as explained in “1-2” above. The step (B) can further comprise: (b-1) a step of obtaining diffraction data by irradiating the crystal obtained by (A) with X-rays; (b-2) a step of obtaining an electron density map of the complex by the analysis of the diffraction data obtained by (b-1); and (b-3) a step of obtaining the structure coordinates of the complex by the analysis of the electron density map obtained by (b-2). The crystal structure of the complex can be analyzed using an X-ray diffractometer in a laboratory or a large-scale synchrotron radiation facility (e.g., ESRF, APS, SPring-8, PF, ALS, CHESS, SRS, LLNL, SSRL, or Brookhaven), and collecting diffraction data using a two-dimensional detector such as an imaging plate or a CCD camera by an oscillation photography method or a Laue method, so that the three-dimensional structure of the complex can be revealed from the data by X-ray crystal structure analysis. Specifically, the diffraction images collected by X-ray diffraction experiments are processed with data-processing software, so that the individual diffraction spots can be indexed and their diffraction intensities calculated by integration. By conducting an inverse Fourier transform using the diffraction intensities and the phase information of these diffraction spots, electron density in a three-dimensional space is obtained. However, in diffraction experiments, it is impossible in principle to measure the phase of each diffraction spot, which is required for the calculation of electron density. Hence, to obtain electron density, the phase, which is the missing information, is inferred by a molecular replacement method, a multiple isomorphous replacement method, a multiple wavelength anomalous dispersion method (MAD method), or modified methods thereof. In accordance with the thus obtained electron density map, a three-dimensional model is built using software that runs on a graphics workstation. After building the model, the structure is refined by a least squares method, a maximum likelihood method, a simulated annealing method, or the like, thereby obtaining the final three-dimensional structure of the complex.
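The role of the missing phases can be illustrated with a one-dimensional toy calculation. The sketch below is not part of the disclosed method: the eight-point "unit cell", the two point atoms, and the function names are assumptions chosen only to show that a Fourier synthesis recovers the density when amplitudes and phases are both available, and fails when only the measurable amplitudes are used.

```python
import cmath
import math

def structure_factors(density):
    """Discrete Fourier transform of a 1D density sampled on n grid points."""
    n = len(density)
    return [
        sum(rho * cmath.exp(-2j * math.pi * h * x / n)
            for x, rho in enumerate(density))
        for h in range(n)
    ]

def synthesize(amplitudes, phases):
    """Fourier synthesis: rebuild the density from amplitudes and phases."""
    n = len(amplitudes)
    return [
        sum(a * cmath.exp(1j * p) * cmath.exp(2j * math.pi * h * x / n)
            for h, (a, p) in enumerate(zip(amplitudes, phases))).real / n
        for x in range(n)
    ]

# Hypothetical toy cell: two point atoms of different weight on an 8-point grid.
density = [0.0, 3.0, 0.0, 0.0, 0.0, 1.0, 0.0, 0.0]
factors = structure_factors(density)

# The experiment yields intensities, i.e. amplitudes only (|F| = sqrt(I)).
amplitudes = [abs(f) for f in factors]
# The phases are known here only because the toy density was chosen by hand.
true_phases = [cmath.phase(f) for f in factors]

with_phases = synthesize(amplitudes, true_phases)
without_phases = synthesize(amplitudes, [0.0] * len(amplitudes))

error_with = max(abs(a - b) for a, b in zip(with_phases, density))
error_without = max(abs(a - b) for a, b in zip(without_phases, density))
print(f"max error with phases:    {error_with:.2e}")
print(f"max error without phases: {error_without:.2f}")
```

With the true phases the synthesis reproduces the input density to rounding error, whereas setting every phase to zero yields a different map. Molecular replacement, multiple isomorphous replacement, and MAD supply estimates of the phases that the experiment itself cannot measure.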

In the MAD method, synchrotron radiation is used, and the diffraction data of crystals are measured by varying incident X-ray wavelengths. Synchrotron radiation that can be used in the MAD method can be generated by, for example, structural biological beam line I (BL41XU) of the SPring-8 large-scale synchrotron radiation facility.

Protein crystals are often damaged by X-ray irradiation, and their diffraction ability deteriorates. Thus, it is preferred to conduct high-resolution X-ray diffraction by low-temperature measurement. The low-temperature measurement is a method that involves rapidly cooling and freezing crystals at approximately −173° C., and then collecting diffraction data in that state. Generally, when protein crystals are frozen, treatment or the like with a solution containing a protectant such as glycerol is devised for the purpose of preventing crystal decay by freezing. Frozen crystals can be prepared by, for example, immersing crystals in a stock solution supplemented with a protectant, and then directly immersing the crystals in liquid nitrogen so as to freeze the crystals instantly.

In the present invention, the crystal of the complex of EGFR and the substance regulating EGFR activity, such as a dimer formed of EGF-EGFR complexes bound to each other, is subjected to X-ray diffraction using an appropriate X-ray source.

X-ray diffraction data are collected with crystals that diffract to a resolution of 10 Å or less, preferably 4.0 Å or less, and more preferably 3.5 Å or less, so that the three-dimensional structure of the complex can be analyzed in detail.

We have analyzed the crystal structure of the complex of EGF and EGFR using a crystal structure analysis technique using X-rays. The crystal of a complex of EGF having the amino acid sequence shown in SEQ ID NO: 2 and EGFR having the amino acid sequence shown in SEQ ID NO: 1 belongs to space group P3₁21 and has a size in terms of unit cell parameters: a=220.2±1.5 Å, b=220.2±1.5 Å, and c=113.1±1.5 Å in the directions of the a, b, and c axes, respectively. Specifically, the crystal of the EGF-EGFR complex of the present invention is characterized in that the unit cell parameters are a=220.2±1.5 Å, b=220.2±1.5 Å, and c=113.1±1.5 Å. Furthermore, using the technique of crystal structure analysis by X-ray diffraction using the crystal of the complex, the three-dimensional structure coordinates (values indicating the spatial and positional relationship of respective atoms) of the complex of EGF and EGFR are obtained. The thus obtained structure coordinates are represented according to a notation method that is generally employed by persons skilled in the art for the three-dimensional structure coordinates of a protein and are shown in Table 1 and Table 2. Table 1 shows structure coordinates as determined with 3.5 Å resolution, and Table 2 shows structure coordinates as determined with 3.3 Å resolution.
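As a worked illustration of these unit cell parameters, the following sketch computes the cell volume and checks whether a measured cell falls within the stated ±1.5 Å tolerance. The cell angles are not given in the text; α=β=90° and γ=120° are assumed here because that is the conventional hexagonal-axes setting for a trigonal space group such as P3₁21. The function names and the sample measurements are hypothetical.

```python
import math

def unit_cell_volume(a, b, c, alpha=90.0, beta=90.0, gamma=90.0):
    """Volume (Å^3) of a general unit cell; lengths in Å, angles in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)

def matches_cell(a, b, c, ref=(220.2, 220.2, 113.1), tol=1.5):
    """True if every cell edge lies within tol (Å) of the reference value."""
    return all(abs(x - r) <= tol for x, r in zip((a, b, c), ref))

# Assumption: hexagonal axes (gamma = 120 degrees) for space group P3(1)21.
volume = unit_cell_volume(220.2, 220.2, 113.1, gamma=120.0)
print(f"unit cell volume: {volume:.3e} cubic angstroms")

# Hypothetical measured cells, used only to exercise the tolerance check.
print(matches_cell(221.0, 219.5, 113.9))
print(matches_cell(225.0, 225.0, 113.1))
```

For a cell with γ=120° the general formula reduces to V = a·b·c·(√3/2), which is on the order of 4.7×10⁶ Å³ for the parameters quoted above.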

Data in Table 1 and Table 2 are presented according to the format of the protein data bank (PDB). The PDB format contains, for example, coordinates of each atom composing a relevant protein molecule, and is a standard format for handling the coordinates of biological molecules. Among symbols or numerals used in Table 1 or Table 2, “ATOM” as recorded in the column on the left-most side (the first column) indicates each atom represented by atomic coordinates. “HETATM” in Table 2 indicates the same, except for indicating that an atom denoted in the line does not belong to an amino acid (e.g., NAG or water molecule). “TER” indicates the C-terminus of a peptide chain. Numerals in the column (the second column) to the right of this column are atom serial numbers (1-8767 or 1-8896), and Roman letters recorded in the column (the third column) to the right side of this column indicate types of atom (see below).

-   -   C: a carbon atom of an amino acid residue
    -   N: a nitrogen atom of an amino acid residue
    -   O: an oxygen atom of an amino acid residue
    -   S: a sulfur atom of an amino acid residue

Roman letters (e.g., A, B, D, and G) recorded together to the right side of the above atoms indicate the positional relationship of the atoms. For example, the Roman letters are recorded as CA, CB, NE, NZ, OE, SG, and the like.

Furthermore, Roman letters recorded in the column (the fourth column) to the right side of the Roman letters indicating types of the above atoms indicate amino acid residues to which the atoms belong, and are denoted by a three letter code (e.g., “GLU,” “LYS,” and “VAL”). “NAG” of atom serial number 8614 and the following atom serial numbers indicates N-acetylglucosamine. “TIP” indicates a water molecule. Roman letters (A, B, C, and D) in the column (the fifth column) to the right side of this column are identification symbols for protein chains, which represent EGFR1, EGFR2, EGF1, and EGF2, respectively. Numerals recorded in the column (the sixth column) to the right side of this column indicate amino acid residue numbers numbered from the N-terminus. Amino acid residue numbers of EGFR correspond to those of the amino acid sequence described in SEQ ID NO: 1. Amino acid residue numbers of EGF correspond to those of the amino acid sequence described in SEQ ID NO: 2. Furthermore, columns ranging from the column (the seventh column) to the right side of these numerals to the ninth column indicate in turn X-coordinates (“a” coordinates) (Angstrom units), Y-coordinates (“b” coordinates) (Angstrom units), and Z-coordinates (“c” coordinates) (Angstrom units). Columns ranging from the column (the tenth column) to the right side of the ninth column to the right-most column (the twelfth column) indicate in turn occupancy (e.g., “1.00”), isotropic temperature factor (for example, 120.73 in the case of atom serial number 1, and 131.35 in the case of atom serial number 2) in Table 1, and atom number or atom symbol (e.g., 6:C, 7:N, 8:O, and 16:S).
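The twelve-column layout described above can be read with a few lines of code. The following is a minimal sketch under stated assumptions: it treats each record as twelve whitespace-separated fields exactly as enumerated in the text, whereas files in the standard PDB format use fixed character columns in which adjacent fields can run together. The names `AtomRecord`, `parse_atom_line`, and `CHAIN_LABELS` are hypothetical, and the sample line is invented for illustration and is not taken from Table 1 or Table 2.

```python
from typing import NamedTuple, Optional

class AtomRecord(NamedTuple):
    record: str          # column 1: "ATOM" or "HETATM"
    serial: int          # column 2: atom serial number
    atom_name: str       # column 3: atom type and position, e.g. "CA"
    residue: str         # column 4: three letter code, e.g. "GLU"
    chain: str           # column 5: chain identifier A-D
    residue_number: int  # column 6: residue number from the N-terminus
    x: float             # columns 7-9: coordinates in Angstrom units
    y: float
    z: float
    occupancy: float     # column 10
    b_factor: float      # column 11: isotropic temperature factor
    element: str         # column 12: atom number or atom symbol

# Chain assignment as stated in the text.
CHAIN_LABELS = {"A": "EGFR1", "B": "EGFR2", "C": "EGF1", "D": "EGF2"}

def parse_atom_line(line: str) -> Optional[AtomRecord]:
    """Parse one whitespace-separated ATOM/HETATM line; return None otherwise."""
    fields = line.split()
    if len(fields) != 12 or fields[0] not in ("ATOM", "HETATM"):
        return None  # e.g. a "TER" line or a malformed record
    return AtomRecord(
        fields[0], int(fields[1]), fields[2], fields[3], fields[4],
        int(fields[5]), float(fields[6]), float(fields[7]), float(fields[8]),
        float(fields[9]), float(fields[10]), fields[11],
    )

# Invented example line; the values are placeholders, not data from the tables.
sample = "ATOM 1 N GLU A 3 10.000 20.000 30.000 1.00 120.73 7"
atom = parse_atom_line(sample)
print(atom.residue, atom.residue_number, CHAIN_LABELS[atom.chain])
print(parse_atom_line("TER"))
```

Returning `None` for anything that is not a complete ATOM or HETATM record keeps chain terminators and header lines from being mistaken for atoms.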

Lengthy table referenced here US07514240-20090407-T00001 Please refer to the end of the specification for access instructions.

Lengthy table referenced here US07514240-20090407-T00002 Please refer to the end of the specification for access instructions.

2. Structure Coordinates of EGF-EGFR Complex

The present invention provides structure coordinates of an EGF-EGFR complex, and specifically, structure coordinates of a crystal of the EGF-EGFR complex, which are obtained by X-ray crystal structure analysis using the crystal of the EGF-EGFR complex having the following characteristics (A) and (B):

-   (A) EGF binds to EGFR at a 1:1 ratio; and
-   (B) the EGF-bound EGFRs form a dimer.

In the present invention, “structure coordinates” are mathematical coordinates deduced by numerically expressing the diffraction intensity at individual diffraction spots obtained by X-ray diffraction resulting from electrons contained in the atoms of crystallized proteins, and then analyzing the numerical data. The structure is represented by three-dimensional coordinates of the positions of the atoms of the above proteins. Specifically, the structure coordinates actually indicate the spatial arrangement defined by each distance between molecules (atoms) composing a chemical structure. To process the spatial arrangement data as information on a computer, relative arrangement data are processed into numerical information (generation of coordinates) as specific coordinates in a coordinate system. This process is required for convenient computer processing. The real nature of the structure coordinates is, as shown above, the arrangement defined by the distance between respective molecules (atoms), and should not be understood as the coordinate values that are temporarily specified upon computer processing. Furthermore, in this specification, atomic coordinates indicate coordinates of individual atoms composing a substance (e.g., a protein or an amino acid).
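The distinction drawn above between the arrangement of atoms and the particular coordinate values can be made concrete. The sketch below uses four made-up atom positions (not taken from Table 1 or Table 2) and hypothetical function names. It rotates and translates the set and confirms that every coordinate value changes while every interatomic distance is preserved.

```python
import math
from itertools import combinations

def pairwise_distances(coords):
    """All interatomic distances, in a fixed pair order."""
    return [math.dist(p, q) for p, q in combinations(coords, 2)]

def rotate_z_and_translate(coords, angle_deg, shift):
    """Rigid-body move: rotate about the z axis, then translate."""
    t = math.radians(angle_deg)
    c, s = math.cos(t), math.sin(t)
    return [
        (c * x - s * y + shift[0], s * x + c * y + shift[1], z + shift[2])
        for x, y, z in coords
    ]

# Made-up atom positions in Angstrom units, for illustration only.
atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.0), (0.3, 2.0, 1.1)]
moved = rotate_z_and_translate(atoms, 37.0, (10.0, -4.0, 2.5))

before = pairwise_distances(atoms)
after = pairwise_distances(moved)
print(max(abs(a - b) for a, b in zip(before, after)))
```

Two coordinate sets related by such a rigid-body transformation therefore describe the same structure in the sense used in this specification.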

We have crystallized the extracellular domains of EGFR forming complexes with EGF and analyzed such crystal by X-ray crystallography, so as to elucidate the mechanism whereby the binding of EGF ligands induces dimerization and activation of the receptors. Thus, we have succeeded in analyzing the crystal structures of domains I, II, and III among the extracellular domains of human EGFR complexed to human EGF at 3.5 Å resolution and 3.3 Å resolution. According to the analytical results, it has been revealed that 2 receptors (EGFR) separately form complexes with ligands (EGF). The complexes are in a form wherein each ligand is captured between the surfaces of domains I and III of each receptor. Moreover, a dimer is formed by interaction between 2 loops extruding from domain II of each receptor, and is stabilized. The formation of the above dimer shows the structural mechanism for EGF-induced receptor dimerization.

The background history of the crystal structure analysis of the EGF-EGFR complex in the present invention and matters deduced from detailed studies on the analytical results will be described as follows.

2-1 Structure Determination

The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with drawings will be provided by the Office upon request and payment of the necessary fee.

In the present invention, the extracellular domain of human EGFR was expressed in Chinese hamster ovary Lec8 cells producing a protein with N-linked oligosaccharides lacking the terminal sialic acid and galactose residues, and then purified. The thus obtained proteins were deglycosylated by digestion with endoglycosidases D and H and crystallized with human EGF (hEGF). We also expressed, purified, deglycosylated, and crystallized selenomethionine-substituted recombinant proteins from Lec8 cells in the same manner as for the native crystals. All diffraction data were collected at SPring-8. The native crystal structure was thus determined to 3.5 Å by the MAD method and the molecular replacement analysis method (FIGS. 1 a and 1 b). The resulting electron density map could be sufficiently used for building a model for the complex (FIG. 1 c). After the final stages of refinement, R_(cryst) and R_(free) were 28% and 34%, respectively, due to the relatively large solvent content of 75% and the relatively large missing regions corresponding to 18% of the amino acid residues composing the complex. Data collection and refinement statistics are shown in Table 3.
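The agreement factors quoted here follow the definitions given in the footnotes to Table 3. As a hedged illustration, the sketch below evaluates those definitions on small invented data sets. The function names and the numbers are hypothetical and do not reproduce the refinement of the present crystal.

```python
def r_sym(observations):
    """R_sym = sum_h sum_i |I_hi - <I_h>| / sum_h sum_i |I_hi|.

    `observations` maps each unique reflection h to its list of
    repeated intensity measurements I_hi.
    """
    numerator = 0.0
    denominator = 0.0
    for intensities in observations.values():
        mean = sum(intensities) / len(intensities)
        numerator += sum(abs(i - mean) for i in intensities)
        denominator += sum(abs(i) for i in intensities)
    return numerator / denominator

def r_factor(f_obs, f_calc):
    """R = sum ||F_o| - |F_c|| / sum |F_o| over one set of reflections.

    Applied to the working set this gives R_work; applied to the
    reflections withheld from refinement it gives R_free.
    """
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    return numerator / sum(abs(o) for o in f_obs)

# Invented measurements for two unique reflections.
merged = r_sym({(1, 0, 0): [100.0, 110.0, 90.0], (0, 1, 0): [50.0, 50.0]})
# Invented observed and calculated structure factor amplitudes.
work = r_factor([10.0, 20.0, 30.0], [12.0, 18.0, 30.0])
print(f"R_sym = {merged:.3f}, R = {work:.3f}")
```

The only difference between R_(work) and R_(free) is the set of reflections passed in: Table 3 states that 5% of the reflections were set aside for R_(free).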

TABLE 3 Diffraction data and crystallographic refinement statistics

Diffraction data statistics

| Crystal | Native | Se-Met (Peak) | Se-Met (Edge) | Se-Met (Remote 1) | Se-Met (Remote 2) |
|---|---|---|---|---|---|
| Wavelength (Å) | 1.000 | 0.9795 | 0.9798 | 0.9733 | 0.9839 |
| Resolution (Å) | 50-3.5 | 50-4.0 | 50-4.0 | 50-4.0 | 50-4.0 |
| Unique reflections | 39517 | 26902 | 26968 | 26796 | 26941 |
| Total reflections | 293584 | 214772 | 210007 | 199751 | 198119 |
| R_(sym) (%)^(a) | 7.5 (27.0)* | 8.0 (20.5)* | 7.2 (25.7)* | 6.4 (22.0)* | 5.7 (26.1)* |
| Completeness (%) | 99.0 (98.6)* | 98.6 (96.8)* | 98.5 (97.0)* | 98.3 (95.3)* | 98.4 (95.9)* |
| I/σ(I) | 25.6 (5.1)* | 24.9 (4.7)* | 21.7 (3.2)* | 22.2 (3.6)* | 20.5 (2.9)* |

MAD phasing

| Quantity | Value |
|---|---|
| Se atoms found | 13 (total 20) |
| Figure of merit | 0.39 |

Crystallographic refinement statistics

| Quantity | Value |
|---|---|
| Resolution (Å) | 10.0-3.5 |
| Number of reflections | 37704 |
| Number of atoms | 8767 |
| R_(work)/R_(free) (%)^(b) | 28.1/34.3 |
| Rmsd from ideality, bond length (Å) | 0.009 |
| Rmsd from ideality, bond angles (°) | 1.6 |

*Numbers within parentheses correspond to the values in the highest resolution shell.

^(a) R_(sym) = Σ_(h) Σ_(i) |I_(hi) − <I_(h)>| / Σ_(h) Σ_(i) |I_(hi)| (where <I_(h)> indicates the average intensity of the h^(th) unique reflection, and I_(hi) indicates the i^(th) observed intensity.)

^(b) R_(work) = Σ ||F_(o)| − |F_(c)|| / Σ |F_(o)|, R_(free) = Σ ||F_(o)| − |F_(c)|| / Σ |F_(o)| (5% of the total reflections were used)

2-2 Overall Structure

In the present invention, the crystal structure determination of a dimeric EGFR-EGF complex at 3.5 Å has revealed atomic structures of EGF-EGFR binding sites and EGFR dimerization sites. The dimer in the asymmetric unit of the crystal contains two EGF-bound EGFRs. The ligand-receptor interface consists of one site in domain I and two sites in domain III of EGFR. A cleft, at which EGF interacts extensively with domains I and III of the receptor, is formed by domains I, II, and III of EGFR. There are absolutely no interactions between EGF and the other receptor. The dimer structure is stabilized by receptor-receptor interaction. First, EGF binds to EGFR, so that domains I and III of EGFR curve toward the EGF side so as to hold the EGF protein within the pocket. The curved portions of the domains are bound back-to-back via domain II (also referred to as S1 domain) so as to form a dimer. At this time, the EGFR proteins are in a form resembling sword guards facing each other. Hence, the two EGF proteins are not directly in contact with each other and are 79 Å apart.

Domains I and III consist of a right-handed repetitive β-sheet structure, whose backbones are individually analogous to L1 and L2 domains of the unliganded IGF-1R (insulin-like growth factor-1 receptor), with the root-mean-squared difference (RMSD) values for the corresponding Cα atoms in each domain of 2.1 Å for domain I and 4.0 Å for domain III, respectively. A comparison with the structure of unliganded IGF-1R has revealed a significant difference between the orientation of domains II and III in the complex of the present invention and the S1-L2 orientation. The dimeric structure of the EGFR-EGF complex has been shown to have features such that loops extrude from each domain II (S1 domain) of the receptor due to the binding of EGFR to EGF, and with these extruding loops the complexes are bound to each other. This binding type has not been shown in any conventional modeling structure of the extracellular domains of EGFR (Protein Sci. 2000, 9, 310-324). This binding type is a finding that has been discovered for the first time in the present invention, and shows new structural features. The loop region required for this dimerization has been shown to be specific to the EGFR family based on the amino acid sequence alignment of the proteins, and is not present in the insulin receptor family. Domain II shows a folding manner analogous to that of S1 of IGF-1R, except for the difference between the loop between Cys240 and Cys267 located outside the curve (on the side opposite from the EGF-bound region) of the EGFR-EGF complex and the loop between Cys252 and Cys273 located inside the IGF-1R molecule, with the RMSD for the corresponding Cα atoms of 3.2 Å. The two equivalent C-terminal residues (Val481) of domain III are 77 Å apart, which would not bring the cytoplasmic domains closer together without domain IV in the dimer.

In addition, in this specification, numerals described together with amino acids (three letter code) indicate amino acid numbers corresponding to those of the amino acid sequence of EGFR (SEQ ID NO: 1) when EGFR is denoted. When EGF is denoted, numerals indicate amino acid numbers corresponding to those of the amino acid sequence of EGF (SEQ ID NO: 2). When IGF-1R is denoted, numerals indicate amino acid numbers corresponding to those of the amino acid sequence of IGF-1R (SEQ ID NO: 3).

2-3 Ligand-Receptor Interaction

EGF interacts with 3 sites on EGFR consisting of site 1 in domain I, and site 2 and site 3 in domain III (FIG. 2 a). These interfaces between the ligand and the receptor are extensive such that the interface area of domain I reaches approximately 720 Å² and that of domain III reaches 730 Å², and the interfaces are dominated by hydrophobic interactions. Leu14, Tyr45, Leu69, and Leu98 side chains in site 1 and Met21, Ile23, and Leu26 side chains in the B-loop of EGF create hydrophobic environments (FIG. 2 b). Val350 and Phe357 side chains in site 2 and Tyr13 and Leu15 side chains in EGF create a hydrophobic environment (FIG. 3 a). Phe412 and Ile438 side chains in site 3 and the Leu47 side chain in EGF create a hydrophobic environment (FIG. 3 b). Amino acid residues having the above properties are conserved among the EGF family members, suggesting that the above interfaces between the receptor and the ligand may be present at conserved interaction sites of other EGF family members.

Site-directed mutagenesis conducted by substituting the amino acid sequence of the ligand has suggested the importance of the above hydrophobic residues, Tyr13, Met21, Ile23 and Leu26, in receptor binding (Tadaki, D. K. & Niyogi, S. K. J. Biol. Chem. 268, 10114-10119 (1993); Campion, S. R. et al., Biochemistry 29, 9988-9993 (1990); Koide, H. et al., Biochim. Biophys. Acta 1120, 257-261 (1992)).

An experiment of substituting Ile23 of EGF by mutagenesis with Ala, Val, Leu, Phe, Trp, or the like has shown that Ile23 in EGF is required for the tight binding with the receptor, and the binding site is known to be appropriate for the size of the isoleucine residue. This is in good agreement with the fact that the Ile23 binding region in EGF, which is formed by Leu14, Tyr45, and Leu69 side chains in EGFR domain I, is almost complementary to the isoleucine side chain in the crystal structure.

Similarly, the result of an experiment conducted by substituting Tyr13 of EGF by mutagenesis with Phe, Leu, Val, Ala or the like and then binding the EGF to the receptor suggests that Tyr13 is of an optimum size. This is in firm agreement with the fact that the Tyr13 binding region, which is formed by Phe357 side chain and His10, Tyr29, and Arg41 side chains in domain III, is complementary to the tyrosine side chain in the crystal structure. The Glu90 side chain in site 1 forms a salt bridge with the Lys28 side chain derived from the ligand, and the Asp355 side chain in site 2 forms a salt bridge with the Arg41 side chain derived from the ligand in the crystal structure. Mutagenesis and biochemical studies have shown that the guanidinium group of Arg41 in EGF is a critical determinant of binding with the receptor (Engler, D. A. et al., J. Biol. Chem. 267, 2274-2281 (1992)). This is in firm agreement with the analytical results of the crystal structure where the above side chains are placed close to the Tyr13 side chain derived from the ligand and the Phe357 side chain derived from the receptor, creating a hydrophobic environment. In a binding assay using EGF in which Lys28 was substituted with Leu by mutagenesis, the substitution had almost no effect on receptor binding. This is apparently inconsistent with the crystal structure, indicating that ionic interactions at position 28 are not always required (Campion, S. R. et al., Biochemistry 29, 9988-9993 (1990)).

Several hydrogen bonds further fortify the ligand-receptor interface. The Gln16 side chain in site 1 forms a hydrogen bond with the Asn32 side chain derived from the ligand. The result of substitution of Asn32 in EGF with His, Phe, Val, Asp, or the like by mutagenesis suggests that Asn32 in EGF is required for tight binding with the receptor (Koide, H. et al., FEBS Lett. 302, 39-42 (1992)). The Gln384 side chain in site 3 forms hydrogen bonds with atoms of Gln43 and Arg45 main chains in the ligand.

A study on chimeric EGF-heregulin peptides (Barbacci, E. G. et al., J. Biol. Chem. 270, 9585-9589 (1995)) has shown that the N-terminal 5 residues of heregulin are of specific importance for binding to the heregulin receptor (ErbB-4). However, no interactions in an N-terminal region are observed in the crystal structure, suggesting that the differences in binding sites of the ligands result in different binding sites of the receptors and may confer ligand-receptor specificities among the EGFR family members. No specific interactions with amino acids whose properties are not conserved among the EGF family members are observed in the crystal structure, except for Asn32 in EGF. The corresponding position in TGF-α, which shares a ligand-binding site with EGF, is substituted with Val, suggesting that Asn32 would not confer ligand specificity on EGFR. It has been shown that the N-terminus of EGF can be linked to the residues Tyr101 and Lys336. Though no interactions between the N-terminal region (disordered, so that the structure cannot be seen) of EGF and EGFR are observed in the crystal structure, the estimated distance between the Cα of Glu5, which is the N-terminus of the EGF structure, and Tyr101 and the distance between the same and Lys336 are 16 Å and 39 Å, respectively, in the crystal structure.

This indicates that the N-terminus of EGF can be linked to Tyr101, but cannot be linked to Lys336. It is also possible that stable dimeric EGF-bound EGFR would be formed via several intermediate states, in which EGF may bind to EGFR at some different sites of a receptor that cannot be elucidated based on the crystal structure. The residues 10 to 17, 85 to 94, and 101 to 107 in domain I and the residues 316 to 325, 353 to 362, and 404 to 413 in domain III seem to loop out of the barrels in the form of a right-handed repetitive β-sheet structure. Among the above loops, the loops 10 to 17 and 353 to 362 participate in EGF binding. This is consistent with the loop 353 to 362 encompassing most of the epitope (residues 351 to 364) of the ligand-competitive monoclonal antibody, LA22 (Wu, D. G. et al., J. Biol. Chem. 264, 17469-17475 (1989)).

In the above explanation, amino acid residues shown to participate in EGF-EGFR interaction based on the results of structure analysis made on the EGF-EGFR complex are examples of amino acids creating interaction that is important in the EGF-EGFR binding sites.
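Residue pairs of the kind listed in this section can be enumerated from structure coordinates with a simple distance criterion. The following is only a sketch: the 4.0 Å cutoff, the function name, and the coordinates are assumptions introduced for illustration (the residue names are borrowed from the text, the positions are invented), and the criterion actually used to identify the interactions described above is not stated in this specification.

```python
import math

def residue_contacts(atoms_a, atoms_b, cutoff=4.0):
    """Return sorted residue pairs with at least one atom pair within cutoff (Å).

    Each atom is given as (residue_label, (x, y, z)).
    """
    pairs = set()
    for residue_a, xyz_a in atoms_a:
        for residue_b, xyz_b in atoms_b:
            if math.dist(xyz_a, xyz_b) <= cutoff:
                pairs.add((residue_a, residue_b))
    return sorted(pairs)

# Invented coordinates; only the residue labels come from the text.
receptor_atoms = [("Glu90", (0.0, 0.0, 0.0)), ("Leu14", (10.0, 0.0, 0.0))]
ligand_atoms = [("Lys28", (3.0, 0.0, 0.0)), ("Leu47", (30.0, 0.0, 0.0))]

contacts = residue_contacts(receptor_atoms, ligand_atoms)
print(contacts)
```

The double loop is quadratic in the number of atoms, which is acceptable for a single interface. A spatial grid or tree would be a natural replacement for whole-structure searches.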

2-4 Receptor-Receptor Interaction

The interaction between the two EGFR molecules in the dimer is almost limited to domain II (FIGS. 4 a, 4 b, and 4 c). The total buried surface area of this interface is approximately 1270 Å². The loop between Cys240 and Cys267, in particular, participates in the receptor dimerization. Across the 2-fold rotational axis of the dimer, the Tyr246, Pro248, and Tyr251 side chains create hydrophobic environments with the Phe230, Phe263, Ala265, Tyr275, and Arg285 side chains derived from the other receptor. This is achieved by hydrogen bonds between the Tyr275 and Arg285 side chains.

The Gln252 side chain forms a hydrogen bond with the backbone nitrogen and carbonyl oxygen of Ala286. Hydrogen bonds between the Tyr246 side chain and the backbone oxygen of Cys283 of the other receptor fortify the interface. The properties of most of the above amino acid residues are conserved among the EGFR family members, suggesting that these interfaces between the loops of each receptor may be present at conserved dimerization sites of other EGFR family members. In addition to the above domain II-domain II interactions, a sole interaction between an amino acid residue in domain II and an amino acid residue in domain I of the other receptor is also observed in the crystal structure disclosed in this specification. In detail, the Thr249 side chain extruding from the dimerization loop forms a hydrogen bond with the Asn86 side chain derived from domain I of the other receptor.

The amino acid residues that are shown in the above explanation to participate in dimerization based on the results of the structure analysis made on the EGF-EGFR complex are examples of amino acids creating important interactions at the EGFR dimerization sites.

2-5 Mechanisms of Ligand-Dependent Receptor Dimerization

It is necessary to first understand the receptor activation mechanism responsible for signal transduction. EGFR has been one of the representative targets of much research on signal transduction across the membrane. During the past quarter century, biochemical and physical biology data concerning the activation mechanism of EGFR have gradually accumulated. Although binding of EGF to the EGFR extracellular domain has been thought to induce stable dimer formation, dimerization patterns (whether dimerization is mediated by EGF or takes place on the receptor side) have remained unknown (Lemmon, M. A. et al., Embo J. 16, 281-294 (1997)). We have shown for the first time the structure of this stable dimeric EGFR-EGF complex.

Based on comparison of the crystal structure of the EGF-EGFR complex with that of unliganded IGF-1R, we have deduced the mechanisms of EGF-induced receptor dimerization. The crystal structure, in which receptor dimerization is not mediated by interactions involving EGF but is mediated by direct interaction between domain II and domain II of the 2 EGFR proteins, indicates that EGF binding to the receptor results in conformational changes that expose receptor-receptor interaction sites. One of the loop regions specific to the EGFR family lies between Cys240 and Cys267 in EGFR. The significant roles of these loops functioning as dimerization sites have been elucidated for the first time from the crystal structure of the present invention.

From the comparison with the unliganded IGF-1R, we have deduced that a ligand binding to the extracellular domain of EGFR would induce a change of inter-domain orientation at a putative hinge region (Lys303 to Lys311) and bring domain III into the proximity of domain II (FIG. 5 a). We have assumed that the Glu293 side chain might make a salt bridge with the Arg285 side chain in domain II of an unbound EGFR monomer. This assumption is reasonably derived from a comparison with the structure of unbound IGF-1R, in which the Arg283 side chain at the corresponding position makes a salt bridge with the Glu276 side chain in S1 domain (FIG. 5 b). The ligand-dependent conformational change would place the Arg405 side chain in domain III within reach of the Glu293 side chain, followed by making a salt bridge between these side chains. The Arg285 side chain located away from its partner for interaction would interact with the Tyr275 side chain. Consequently, a hydrophobic environment is created together with the Phe263, Tyr275, and Arg285 side chains, where the Tyr251 side chain extruding from domain II of the other receptor makes hydrophobic contact as shown in the crystal structure (FIG. 5 c). We speculate that the ligand-dependent conformational change of the receptor may control the above interaction and other unidentified structural changes.

We now discuss other potential structural changes. It would be more effective for ligand-dependent receptor activation to assume that the dimerization loop does not extrude from domain II and is tucked into itself before receptor activation. For this purpose, we propose the following model by tentatively assuming that the Arg273 side chain might make a salt bridge with the Asp254 side chain in domain II of an unbound EGFR monomer (FIG. 5 c).

An EGF-dependent conformational change of the receptor would place the Gly458 main chain in domain III within reach of Arg273, followed by making a hydrogen bond between the oxygen of the Gly458 main chain and the side chain derived from Arg273. The Asp254 side chain would be liberated from all interactions, so that the dimerization loop encompassing Asp254 extrudes from domain II, thereby permitting interactions to take place between the dimerization loops of each receptor.

Dimerization of the extracellular domains of EGFRs brings the cytoplasmic tyrosine kinase domains of the two receptors into close proximity, resulting in activation of the intrinsic receptor tyrosine kinase in the intracellular domain, similarly to the case of an erythropoietin receptor (Remy, I. et al., Science 283, 990-993 (1999)).

2-6 Conclusion

We have revealed for the first time the structure of a dimeric EGFR-EGF complex on the basis of X-ray diffraction data at 3.5 Å and 3.3 Å. The crystal structure of the present invention displays the novel structural feature that only the two loops extruding from each domain II of the receptor in the dimer are major dimerization sites. The crystal structure of the present invention provides for the first time not only information by which the ligand-dependent EGFR activation is explained on the atomic level, but also important information for developing an EGFR family antagonist or an activation inhibitor as, for example, a novel antitumor agent.

Comparison and analysis of the coordinates of EGF crystal structures and EGFR modeling structures that have been obtained by conventional techniques with those of the EGF-EGFR complex crystal structure obtained in the present invention make it clear that the flow of the amino acid backbones and the side chain conformations of the former structures do not have accuracy sufficient for use in industrial applications such as drug design.

When EGF crystal structure 1JL9 (J. Biol. Chem. 276 pp. 34913 (2001)) elucidated in 2001 and the structure of an EGFR-EGF complex were compared, specifically, in the case of superimposition (superimposition by the least squares method) of the Cys6-Leu47 main chain, the RMSD (root mean square deviation) value between the two was 1.5 Å in the case of the 1JL9 A chain, and 2.9 Å in the case of the 1JL9 B chain. When superimposition was conducted including side chains important for exerting EGF activity, the gap between the structures increased, such that the RMSD value between the two was 3.5 Å in the case of the A chain, and 4.2 Å in the case of the B chain. It was revealed that the EGF structure of the conventional technology merely reflects the structure in a state of being unbound to EGFR, and is not sufficient information to provide the active conformation observed in the complex with EGFR exerting pharmacological activity. The only reason why the active conformation of EGF in the EGF-EGFR co-crystal structure is so different from the conformation of EGF in a free state presented by conventional technology is that the binding of EGF to EGFR causes the conformation or the orientation of the side chains to alter in terms of the exertion of EGF activity. This altered structure itself is the active conformation, and the provision of this is useful for creating or developing drugs. Hence, the elucidation of the active conformation of the EGF/EGFR by the present invention clearly involves an inventive step.
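The RMSD values quoted above measure the average displacement between corresponding atoms of two structures. The following minimal sketch computes that quantity for two coordinate sets that are assumed to be already superimposed and listed in the same atom order. It does not perform the least-squares superimposition itself (an algorithm such as Kabsch's would be one way to do that step), and the function name and coordinates are invented for illustration.

```python
import math

def rmsd(coords_a, coords_b):
    """Root mean square deviation (Å) between matched, pre-superimposed atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))

# Invented coordinates: the second set is the first shifted by 1 Å along z.
reference = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)]
shifted = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0), (1.0, 1.0, 1.0)]

print(rmsd(reference, reference))
print(rmsd(reference, shifted))
```

Because the deviation is squared before averaging, a few strongly displaced atoms dominate the value. This is consistent with the larger RMSD reported above once side chains are included in the comparison.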

Regarding the three-dimensional structure of the EGFR protein in conventional technology, no structure of the EGF-EGFR complex has been known, and the modeling structure of an EGFR monomer has been merely predicted. Conventionally, a ligand-binding site of an insulin-like growth factor-1 receptor (IGF-1R) has been crystallized in an unliganded state, and the structure has been specified. However, since the active conformation when the ligand is bound cannot be predicted with this structure, it has been difficult to provide information sufficient to carry out scientific analysis and molecular design. In particular, the active conformation of EGFR alters dynamically, so that it is impossible to infer the active conformation using an unliganded structure. In the homology modeling method, the structure of a template specifies the conformation of a target protein (EGFR monomer) model structure upon modeling. By conducting EGFR modeling using an unliganded IGF-1R structure as a template, an EGFR model with conformation similar to that of the unliganded IGF-1R can be obtained. Comparison of the crystal structure of the insulin-like growth factor-1 receptor monomer used as a template for the EGFR modeling structure with the EGFR structure of the EGF-EGFR complex revealed by the present invention makes it possible to understand, even by visual analysis, that modeling of the active conformation using the unliganded structure is extremely difficult (see FIG. 8).

The structure coordinates provided from the co-crystal structure of the complex of the present invention are useful in pharmacophore extraction of a substance regulating EGFR activity (specifically, agonists and antagonists), computer screening using all or some of the structure coordinates of the complex, molecular design (e.g., increased activity and provision of selectivity) of EGFR agonists or EGFR antagonists, design and screening for industrially useful EGF variants or EGFR variants, preparation of EGF neutralization antibodies and EGF agonist antibodies, the molecular replacement method utilizing the EGF-EGFR crystal structure, modeling of a protein thought to have folds similar to those of EGFR such as an insulin receptor and the use of the modeling structure (e.g., computer screening, molecular design, antibody design, designing of altered proteins, and the molecular replacement method), and the like. The present invention provides the structure coordinates of the EGF-EGFR complex that can be used for these purposes.

More specifically, the present invention provides the following structure coordinates (a) to (zc):

-   (a) all or some of structure coordinates shown in Table 1;
-   (b) structure coordinates of an EGF-EGFR complex, which are characterized in that the root mean square deviation of α carbon atomic coordinates is 2.0 Å or less when the structure coordinates are superimposed on those of the EGF-EGFR complex shown in Table 1;
-   (c) structure coordinates of EGFR, which are specified by atom serial numbers 1 to 3957 from the structure coordinates shown in Table 1;
-   (d) structure coordinates of EGFR, which are characterized in that the root mean square deviation of α carbon atomic coordinates is 2.0 Å or less when the structure coordinates are superimposed on those of EGFR specified by atom serial numbers 1 to 3957 from the structure coordinates shown in Table 1;
-   (e) some of the structure coordinates of EGFR, comprising at least any one of the following structure coordinates (e-1) to (e-4);

(e-1) structure coordinates of a portion corresponding to domain I of EGFR which are specified by atom serial numbers 1 to 3957 from the structure coordinates shown in Table 1;

(e-2) structure coordinates of a portion corresponding to domain II of EGFR which are specified by atom serial numbers 1 to 3957 from the structure coordinates shown in Table 1;

(e-3) structure coordinates of a portion corresponding to domain III of EGFR which are specified by atom serial numbers 1 to 3957 from the structure coordinates shown in Table 1;

(e-4) structure coordinates of a portion corresponding to domain I, II, or III of EGFR, which are characterized in that the root mean square deviation of α carbon atomic coordinates is 2.0 Å or less when the structure coordinates are superimposed on those of the portion corresponding to domain I, II, or III of EGFR specified by any one of (e-1) to (e-3);

-   (f) structure coordinates of EGFR, which are specified by atom serial numbers 3958 to 7887 from the structure coordinates shown in Table 1;
-   (g) structure coordinates of EGFR, which are characterized in that the root mean square deviation of α carbon atomic coordinates is 2.0 Å or less when the structure coordinates are superimposed on those of EGFR specified by atom serial numbers 3958 to 7887 from the structure coordinates shown in Table 1;
-   (h) some of the structure coordinates of EGFR, comprising at least any one of the following structure coordinates (h-1) to (h-4);

(h-1) structure coordinates of a portion corresponding to domain I of EGFR which are specified by atom serial numbers 3958 to 7887 from the structure coordinates shown in Table 1;

(h-2) structure coordinates of a portion corresponding to domain II of EGFR which are specified by atom serial numbers 3958 to 7887 from the structure coordinates shown in Table 1;

(h-3) structure coordinates of a portion corresponding to domain III of EGFR which are specified by atom serial numbers 3958 to 7887 from the structure coordinates shown in Table 1;

(h-4) structure coordinates of a portion corresponding to domain I, II, or III of EGFR, which are characterized in that the root mean square deviation of α carbon atomic coordinates is 2.0 Å or less when the structure coordinates are superimposed on those of the portion corresponding to domain I, II, or III of EGFR specified by any one of (h-1) to (h-3);

-   (i) structure coordinates of EGF, which are specified by atom serial numbers 7888 to 8250 from the structure coordinates shown in Table 1;
-   (j) structure coordinates of EGF, which are characterized in that the root mean square deviation of α carbon atomic coordinates is 2.0 Å or less when the structure coordinates are superimposed on those of EGF specified by atom serial numbers 7888 to 8250 from the structure coordinates shown in Table 1;
-   (k) structure coordinates of EGF, which are specified by atom serial numbers 8251 to 8613 from the structure coordinates shown in Table 1;
-   (l) structure coordinates of EGF, which are characterized in that the root mean square deviation of α carbon atomic coordinates is 2.0 Å or less when the structure coordinates are superimposed on those of EGF specified by atom serial numbers 8251 to 8613 from the structure coordinates shown in Table 1;
-   (m) all or some of structure coordinates shown in Table 2;
-   (n) structure coordinates of an EGF-EGFR complex, which are characterized in that the root mean square deviation of α carbon atomic coordinates is 2.0 Å or less when the structure coordinates are superimposed on those of the EGF-EGFR complex specified by the structure coordinates shown in Table 2;
-   (o) structure coordinates of EGFR, which are specified by atom serial numbers 1 to 3957 from the structure coordinates shown in Table 2;
-   (p) structure coordinates of EGFR, which are characterized in that the root mean square deviation of α carbon atomic coordinates is 2.0 Å or less when the structure coordinates are superimposed on those of EGFR specified by atom serial numbers 1 to 3957 from the structure coordinates shown in Table 2;
-   (q) some of the structure coordinates of EGFR, comprising at least any one of the following structure coordinates (q-1) to (q-4);

(q-1) structure coordinates of a portion corresponding to domain I of EGFR which are specified by atom serial numbers 1 to 3957 from the structure coordinates shown in Table 2;

(q-2) structure coordinates of a portion corresponding to domain II of EGFR which are specified by atom serial numbers 1 to 3957 from the structure coordinates shown in Table 2;

(q-3) structure coordinates of a portion corresponding to domain III of EGFR which are specified by atom serial numbers 1 to 3957 from the structure coordinates shown in Table 2;

(q-4) structure coordinates of a portion corresponding to domain I, II, or III of EGFR, which are characterized in that the root mean square deviation of α carbon atomic coordinates is 2.0 Å or less when the structure coordinates are superimposed on those of the portion corresponding to domain I, II, or III of EGFR specified by any one of (q-1) to (q-3);

-   (r) structure coordinates of EGFR, which are specified by atom serial numbers 3958 to 7905 from the structure coordinates shown in Table 2;
-   (s) structure coordinates of EGFR, which are characterized in that the root mean square deviation of α carbon atomic coordinates is 2.0 Å or less when the structure coordinates are superimposed on those of EGFR specified by atom serial numbers 3958 to 7905 from the structure coordinates shown in Table 2;
-   (t) some of the structure coordinates of EGFR, comprising at least any one of the following structure coordinates (t-1) to (t-4);

(t-1) structure coordinates of a portion corresponding to domain I of EGFR which are specified by atom serial numbers 3958 to 7905 from the structure coordinates shown in Table 2;

(t-2) structure coordinates of a portion corresponding to domain II of EGFR which are specified by atom serial numbers 3958 to 7905 from the structure coordinates shown in Table 2;

(t-3) structure coordinates of a portion corresponding to domain III of EGFR which are specified by atom serial numbers 3958 to 7905 from the structure coordinates shown in Table 2;

(t-4) structure coordinates of a portion corresponding to domain I, II, or III of EGFR, which are characterized in that the root mean square deviation of α carbon atomic coordinates is 2.0 Å or less when the structure coordinates are superimposed on those of the portion corresponding to domain I, II, or III of EGFR specified by any one of (t-1) to (t-3);

-   (u) structure coordinates of EGF, which are specified by atom serial numbers 7906 to 8291 from the structure coordinates shown in Table 2;
-   (v) structure coordinates of EGF, which are characterized in that the root mean square deviation of α carbon atomic coordinates is 2.0 Å or less when the structure coordinates are superimposed on those of EGF specified by atom serial numbers 7906 to 8291 from the structure coordinates shown in Table 2;
-   (w) structure coordinates of EGF, which are specified by atom serial numbers 8292 to 8677 from the structure coordinates shown in Table 2;
-   (x) structure coordinates of EGF, which are characterized in that the root mean square deviation of α carbon atomic coordinates is 2.0 Å or less when the structure coordinates are superimposed on those of EGF specified by atom serial numbers 8292 to 8677 from the structure coordinates shown in Table 2;
-   (ya) structure coordinates of a first EGF-EGFR binding site selected from any one of the following (ya-1) to (ya-8);

(ya-1) structure coordinates of an EGF-EGFR binding site, comprising atomic coordinates of amino acid residues corresponding to at least Leu26 and Lys28 of EGF and Leu69 and Leu98 of EGFR;

(ya-2) structure coordinates of an EGF-EGFR binding site, comprising atomic coordinates of amino acid residues corresponding to at least Leu26 and Lys28 of EGF and Leu69 and Leu98 of EGFR, and atomic coordinates of one or more amino acid residues selected from amino acid residues corresponding to Met21, Ile23, Ala25, Cys31, Asn32, and Cys33 of EGF, and Leu14, Gln16, Gly18, Glu35, Tyr45, Ala68, Glu90, and Tyr101 of EGFR;

(ya-3) structure coordinates of an EGF-EGFR binding site, consisting of the atomic coordinates of the amino acid residues corresponding to at least Leu26 and Lys28 of EGF and Leu69 and Leu98 of EGFR, and atomic coordinates of one or more amino acid residues selected from amino acid residues corresponding to Met21, Ile23, Ala25, Cys31, Asn32, and Cys33 of EGF and Leu14, Gln16, Gly18, Glu35, Tyr45, Ala68, Glu90, and Tyr101 of EGFR;

(ya-4) structure coordinates of an EGF-EGFR binding site, consisting of atomic coordinates of amino acid residues corresponding to at least Leu26 and Lys28 of EGF and Leu69 and Leu98 of EGFR, and atomic coordinates of one or more amino acid residues selected from amino acid residues corresponding to Met21, Ile23, Ala25, Cys31, Asn32, and Cys33 of EGF, and Leu14, Gln16, Gly18, Glu35, Tyr45, Ala68, Glu90, and Tyr101 of EGFR and amino acid residues adjacent thereto;

(ya-5) structure coordinates of an EGF-EGFR binding site, comprising atomic coordinates of amino acid residues corresponding to at least Ile23, Leu26, and Lys28 of EGF and Leu14, Tyr45, Leu69, Glu90, and Leu98 of EGFR;

(ya-6) structure coordinates of an EGF-EGFR binding site, comprising atomic coordinates of amino acid residues corresponding to at least Ile23, Leu26, and Lys28 of EGF and Leu14, Tyr45, Leu69, Glu90, and Leu98 of EGFR, and atomic coordinates of one or more amino acid residues selected from amino acid residues corresponding to Met21, Ala25, Cys31, Asn32, and Cys33 of EGF and Gln16, Gly18, Glu35, Ala68, and Tyr101 of EGFR;

(ya-7) structure coordinates of an EGF-EGFR binding site, consisting of atomic coordinates of the amino acid residues corresponding to at least Ile23, Leu26, and Lys28 of EGF and Leu14, Tyr45, Leu69, Glu90, and Leu98 of EGFR, and atomic coordinates of one or more amino acid residues selected from amino acid residues corresponding to Met21, Ala25, Cys31, Asn32, and Cys33 of EGF and Gln16, Gly18, Glu35, Ala68, and Tyr101 of EGFR;

(ya-8) structure coordinates of an EGF-EGFR binding site, consisting of atomic coordinates of amino acid residues corresponding to at least Ile23, Leu26, and Lys28 of EGF and Leu14, Tyr45, Leu69, Glu90, and Leu98 of EGFR, and atomic coordinates of one or more amino acid residues selected from amino acid residues corresponding to Met21, Ala25, Cys31, Asn32, and Cys33 of EGF and Gln16, Gly18, Glu35, Ala68, and Tyr101 of EGFR and amino acid residues adjacent thereto;

-   (yb) structure coordinates of a second EGF-EGFR binding site selected from any one of the following (yb-1) to (yb-8);

(yb-1) structure coordinates of an EGF-EGFR binding site, comprising atomic coordinates of amino acid residues corresponding to at least Leu15 and Arg41 of EGF and Val350 and Asp355 of EGFR;

(yb-2) structure coordinates of an EGF-EGFR binding site, comprising atomic coordinates of amino acid residues corresponding to at least Leu15 and Arg41 of EGF and Val350 and Asp355 of EGFR, and atomic coordinates of one or more amino acid residues selected from amino acid residues corresponding to His10, Asp11, Tyr13, His16, Cys31, Asn32, Cys33, Ile38, Gly39, and Glu40 of EGF, and Thr10, Asn12, Lys13, Gln16, Leu17, Gly18, Leu27, Leu325, Ser356, Phe357, Gln384, and His409 of EGFR;

(yb-3) structure coordinates of an EGF-EGFR binding site, consisting of atomic coordinates of amino acid residues corresponding to at least Leu15 and Arg41 of EGF and Val350 and Asp355 of EGFR, and atomic coordinates of one or more amino acid residues selected from amino acid residues corresponding to His10, Asp11, Tyr13, His16, Cys31, Asn32, Cys33, Ile38, Gly39, and Glu40 of EGF and Thr10, Asn12, Lys13, Gln16, Leu17, Gly18, Leu27, Leu325, Ser356, Phe357, Gln384, and His409 of EGFR;

(yb-4) structure coordinates of an EGF-EGFR binding site, consisting of atomic coordinates of amino acid residues corresponding to at least Leu15 and Arg41 of EGF and Val350 and Asp355 of EGFR, and atomic coordinates of one or more amino acid residues selected from amino acid residues corresponding to His10, Asp11, Tyr13, His16, Cys31, Asn32, Cys33, Ile38, Gly39, and Glu40 of EGF, and Thr10, Asn12, Lys13, Gln16, Leu17, Gly18, Leu27, Leu325, Ser356, Phe357, Gln384, and His409 of EGFR and amino acid residues adjacent thereto;

(yb-5) structure coordinates of an EGF-EGFR binding site, comprising atomic coordinates of amino acid residues corresponding to at least Tyr13, Leu15, and Arg41 of EGF and Val350, Asp355, and Phe357 of EGFR;

(yb-6) structure coordinates of an EGF-EGFR binding site, comprising atomic coordinates of amino acid residues corresponding to at least Tyr13, Leu15, and Arg41 of EGF and Val350, Asp355, and Phe357 of EGFR, and atomic coordinates of one or more amino acid residues selected from amino acid residues corresponding to His10, Asp11, His16, Cys31, Asn32, Cys33, Ile38, Gly39, and Glu40 of EGF and Thr10, Asn12, Lys13, Gln16, Leu17, Gly18, Leu27, Leu325, Ser356, Gln384, and His409 of EGFR;

(yb-7) structure coordinates of an EGF-EGFR binding site, consisting of atomic coordinates of amino acid residues corresponding to at least Tyr13, Leu15, and Arg41 of EGF and Val350, Asp355, and Phe357 of EGFR, and atomic coordinates of one or more amino acid residues selected from amino acid residues corresponding to His10, Asp11, His16, Cys31, Asn32, Cys33, Ile38, Gly39, and Glu40 of EGF and Thr10, Asn12, Lys13, Gln16, Leu17, Gly18, Leu27, Leu325, Ser356, Gln384, and His409 of EGFR;

(yb-8) structure coordinates of an EGF-EGFR binding site, consisting of atomic coordinates of amino acid residues corresponding to at least Tyr13, Leu15, and Arg41 of EGF and Val350, Asp355, and Phe357 of EGFR, and atomic coordinates of one or more amino acid residues selected from amino acid residues corresponding to His10, Asp11, His16, Cys31, Asn32, Cys33, Ile38, Gly39, and Glu40 of EGF and Thr10, Asn12, Lys13, Gln16, Leu17, Gly18, Leu27, Leu325, Ser356, Gln384, and His409 of EGFR and amino acid residues adjacent thereto;

-   (yc) structure coordinates of a third EGF-EGFR binding site selected from any one of the following (yc-1) to (yc-8);

(yc-1) structure coordinates of an EGF-EGFR binding site, comprising the atomic coordinates of amino acid residues corresponding to at least Arg45 and Leu47 of EGF and Leu382, Gln384, Phe412, and Ile438 of EGFR;

(yc-2) structure coordinates of an EGF-EGFR binding site, comprising atomic coordinates of amino acid residues corresponding to at least Arg45 and Leu47 of EGF and Leu382, Gln384, Phe412, and Ile438 of EGFR, and atomic coordinates of one or more amino acid residues selected from amino acid residues corresponding to Gln43, Tyr44, Asp46, and Lys48 of EGF, and Arg29, Leu325, His346, Gln408, Gln411, Ala415, and Val417 of EGFR;

(yc-3) structure coordinates of an EGF-EGFR binding site, consisting of atomic coordinates of amino acid residues corresponding to at least Arg45 and Leu47 of EGF and Leu382, Gln384, Phe412, and Ile438 of EGFR, and atomic coordinates of one or more amino acid residues selected from amino acid residues corresponding to Gln43, Tyr44, Asp46, and Lys48 of EGF and Arg29, Leu325, His346, Gln408, Gln411, Ala415, and Val417 of EGFR;

(yc-4) structure coordinates of an EGF-EGFR binding site, consisting of atomic coordinates of amino acid residues corresponding to at least Arg45 and Leu47 of EGF and Leu382, Gln384, Phe412, and Ile438 of EGFR, and atomic coordinates of one or more amino acid residues selected from amino acid residues corresponding to Gln43, Tyr44, Asp46, and Lys48 of EGF and Arg29, Leu325, His346, Gln408, Gln411, Ala415, and Val417 of EGFR and amino acid residues adjacent thereto;

(yc-5) structure coordinates of an EGF-EGFR binding site, comprising atomic coordinates of amino acid residues corresponding to at least Gln43, Arg45, and Leu47 of EGF and Leu382, Gln384, Phe412, and Ile438 of EGFR;

(yc-6) structure coordinates of an EGF-EGFR binding site, comprising atomic coordinates of amino acid residues corresponding to at least Gln43, Arg45, and Leu47 of EGF and Leu382, Gln384, Phe412, and Ile438 of EGFR, and atomic coordinates of one or more amino acid residues selected from amino acid residues corresponding to Tyr44, Asp46, and Lys48 of EGF and Arg29, Leu325, His346, Gln408, Gln411, Ala415, and Val417 of EGFR;

(yc-7) structure coordinates of an EGF-EGFR binding site, consisting of atomic coordinates of amino acid residues corresponding to at least Gln43, Arg45, and Leu47 of EGF and Leu382, Gln384, Phe412, and Ile438 of EGFR, and atomic coordinates of one or more amino acid residues selected from amino acid residues corresponding to Tyr44, Asp46, and Lys48 of EGF and Arg29, Leu325, His346, Gln408, Gln411, Ala415, and Val417 of EGFR;

(yc-8) structure coordinates of an EGF-EGFR binding site, consisting of atomic coordinates of amino acid residues corresponding to at least Gln43, Arg45, and Leu47 of EGF and Leu382, Gln384, Phe412, and Ile438 of EGFR, and atomic coordinates of one or more amino acid residues selected from amino acid residues corresponding to Tyr44, Asp46, and Lys48 of EGF and Arg29, Leu325, His346, Gln408, Gln411, Ala415, and Val417 of EGFR and amino acid residues adjacent thereto;

-   (yd) structure coordinates of an EGF-EGFR binding site, comprising the first EGF-EGFR binding site of (ya) and the second EGF-EGFR binding site of (yb);
-   (ye) structure coordinates of an EGF-EGFR binding site, comprising the second EGF-EGFR binding site of (yb) and the third EGF-EGFR binding site of (yc);
-   (yf) structure coordinates of an EGF-EGFR binding site, comprising the first EGF-EGFR binding site of (ya) and the third EGF-EGFR binding site of (yc);
-   (yg) structure coordinates of an EGF-EGFR binding site, comprising the first EGF-EGFR binding site of (ya), the second EGF-EGFR binding site of (yb), and the third EGF-EGFR binding site of (yc);
-   (yh) structure coordinates of an EGF-EGFR binding site selected from any one of the following (yh-1) to (yh-3);

(yh-1) structure coordinates of an EGF-EGFR binding site, comprising atomic coordinates of amino acid residues corresponding to at least Tyr13, Glu40, Arg41, Asp46, and Leu47 of EGF and Phe357, Lys13, Asp355, Arg29, Leu382, Ala415, and Val417 of EGFR;

(yh-2) structure coordinates, consisting of atomic coordinates of amino acid residues corresponding to His10, Asp11, Tyr13, Leu15, His16, Met21, Ile23, Ala25, Leu26, Lys28, Ala30, Cys31, Asn32, Cys33, Val35, Tyr37, Ile38, Gly39, Glu40, Arg41, Gln43, Tyr44, Arg45, Asp46, Leu47, Lys48, and Trp49 of EGF and Asn12, Lys13, Leu14, Thr15, Gln16, Leu17, Gly18, Asp22, Arg29, Tyr45, Ala68, Leu69, Tyr89, Glu90, Leu98, Ser99, Tyr101, Leu325, His346, Leu348, Pro349, Val350, Asp355, Ser356, Phe357, Thr358, Leu382, Gln384, Gln408, His409, Gln411, Phe412, Ala415, Val417, Ile438, and Lys465 of EGFR;

(yh-3) structure coordinates, consisting of atomic coordinates of amino acid residues corresponding to His10, Asp11, Tyr13, Leu15, His16, Met21, Ile23, Ala25, Leu26, Lys28, Ala30, Cys31, Asn32, Cys33, Val35, Tyr37, Ile38, Gly39, Glu40, Arg41, Gln43, Tyr44, Arg45, Asp46, Leu47, Lys48, and Trp49 of EGF and Asn12, Lys13, Leu14, Thr15, Gln16, Leu17, Gly18, Asp22, Arg29, Tyr45, Ala68, Leu69, Tyr89, Glu90, Leu98, Ser99, Tyr101, Leu325, His346, Leu348, Pro349, Val350, Asp355, Ser356, Phe357, Thr358, Leu382, Gln384, Gln408, His409, Gln411, Phe412, Ala415, Val417, Ile438, and Lys465 of EGFR and amino acid residues adjacent thereto;

-   (yi) structure coordinates of an EGF-EGFR binding site;
-   (yj) structure coordinates of an EGFR dimerization site, comprising atomic coordinates of amino acid residues corresponding to at least Thr249, Tyr246, and Gln252 of the 1st EGFR protein and Asn86, Cys283, and Ala286 of the 2nd EGFR protein, where the two EGFR proteins form a dimer;
-   (yk) structure coordinates of an EGFR dimerization site, comprising atomic coordinates of amino acid residues corresponding to at least Thr249, Tyr246, and Gln252 of the 1st EGFR protein and Asn86, Cys283, and Ala286 of the 2nd EGFR protein, where the two EGFR proteins form a dimer, and atomic coordinates of one or more amino acid residues selected from amino acid residues corresponding to Asn86, Gln194, Pro204, Ser205, Lys229, Phe230, Thr239, Pro242, Tyr246, Pro248, Thr249, Tyr251, Gln252, Met253, Ser262, Phe263, Gly264, Ala265, Tyr275, His280, Ser282, Cys283, Val284, Arg285, Ala286, and Lys303 of both EGFR proteins;
-   (yl) structure coordinates of an EGFR dimerization site, consisting of atomic coordinates of amino acid residues corresponding to Asn86, Gln194, Pro204, Ser205, Lys229, Phe230, Thr239, Pro242, Tyr246, Pro248, Thr249, Tyr251, Gln252, Met253, Ser262, Phe263, Gly264, Ala265, Tyr275, His280, Ser282, Cys283, Val284, Arg285, Ala286, and Lys303 of both EGFR proteins forming a dimer;
-   (ym) structure coordinates, consisting of atomic coordinates of amino acid residues corresponding to Asn86, Gln194, Pro204, Ser205, Lys229, Phe230, Thr239, Pro242, Tyr246, Pro248, Thr249, Tyr251, Gln252, Met253, Ser262, Phe263, Gly264, Ala265, Tyr275, His280, Ser282, Cys283, Val284, Arg285, Ala286, and Lys303 and amino acid residues adjacent thereto of both EGFR proteins forming a dimer;
-   (yn) structure coordinates of an EGFR dimerization site, comprising atomic coordinates of amino acid residues corresponding to at least Arg405 and Glu293 of EGFR, and atomic coordinates of one or more amino acid residues selected from amino acid residues corresponding to Arg285, Arg273, Asp254, and Gly458 of EGFR;
-   (yo) structure coordinates of an EGFR dimerization site, consisting of atomic coordinates of amino acid residues corresponding to at least Arg405 and Glu293 of EGFR, and atomic coordinates of one or more amino acid residues selected from amino acid residues corresponding to Arg285, Arg273, Asp254, and Gly458 of EGFR;
-   (yp) structure coordinates, consisting of atomic coordinates of amino acid residues corresponding to at least Arg405 and Glu293 of EGFR, and atomic coordinates of one or more amino acid residues selected from the amino acid residues corresponding to Arg285, Arg273, Asp254, and Gly458 of EGFR and amino acid residues adjacent thereto;
-   (yq) structure coordinates of an EGFR dimerization site;
-   (za) structure coordinates of a ligand-receptor binding site, which are characterized in that the root mean square deviation of α carbon atomic coordinates is 1.5 Å or less when the structure coordinates are superimposed on those of the amino acid residues composing the EGF-EGFR binding site of any one of (ya) to (yi);
-   (zb) structure coordinates of a receptor dimerization site, which are characterized in that the root mean square deviation of α carbon atomic coordinates is 1.5 Å or less when the structure coordinates are superimposed on those of the amino acid residues composing the EGFR dimerization site of any one of (yj) to (yq);
-   (zc) structure coordinates of the EGF-EGFR complex or a structure homologue thereof, comprising the structure coordinates of any one of (a) to (zb).

In the above explanation of the structure coordinates, “α carbon” means a carbon atom at position α of an amino acid, and is also denoted as “Cα.” The three-dimensional structure of a protein is not firmly fixed, but involves fluctuation to some extent. As long as the deviation of the backbone position between entire protein structures, that is, the root mean square deviation of α carbon atoms, is 2.0 Å or less, the proteins can be thought to have structures substantially equivalent to each other. Preferably, the root mean square deviation is 1.5 Å or less, and further preferably 1.0 Å or less. Furthermore, in the case of the structures of receptor dimerization sites or ligand-receptor binding sites, as long as the root mean square deviation of the backbone atoms is 1.5 Å or less, the sites can be thought to have functionally equivalent structures. Preferably, the root mean square deviation is 1.0 Å or less, more preferably 0.7 Å or less, and further preferably 0.5 Å or less.
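The equivalence criteria above can be illustrated with a minimal Python sketch. It assumes the two coordinate sets have already been superimposed (for example, by a least squares fit performed elsewhere) and that the α carbon atoms are paired in the same order; the function and constant names are hypothetical and are not part of the specification.

```python
import math

# Thresholds in angstroms, taken from the text above (loosest to tightest).
WHOLE_PROTEIN_THRESHOLDS = (2.0, 1.5, 1.0)
SITE_THRESHOLDS = (1.5, 1.0, 0.7, 0.5)


def ca_rmsd(coords_a, coords_b):
    """Root mean square deviation between paired (x, y, z) tuples.

    Assumption: both lists are already superimposed and list the same
    alpha carbon atoms in the same order.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and equal in length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


def tightest_threshold(rmsd, thresholds):
    """Return the smallest threshold that rmsd satisfies, or None if none."""
    for limit in sorted(thresholds):
        if rmsd <= limit:
            return limit
    return None
```

A return value of `None` from `tightest_threshold` means the structures fall outside even the loosest criterion for that category.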

“Some (of structure coordinates)” in (a) or (m) above means structure coordinates consisting of any one or more of the atomic coordinates of the structure coordinates of Table 1 or Table 2. For example, the structure coordinates in Table 1 or Table 2 contain 2 molecules of EGF and 2 molecules of EGFR, and a portion corresponding to one EGF molecule or a portion corresponding to one EGFR molecule among these molecules is an appropriate example of “some (of structure coordinates).” In addition, the coordinates of an EGF-EGFR binding site and coordinates of an EGFR dimerization site are also appropriate examples of “some (of structure coordinates).” Preferably, at a minimum, “some (of structure coordinates)” refers to structure coordinates containing atomic coordinates of an EGF-EGFR binding site or an EGFR dimerization site. As long as “some (of structure coordinates)” contain structure coordinates of an EGF-EGFR binding site or an EGFR dimerization site, they may also contain any one or more of the other atomic coordinates in Table 1 or Table 2. Another appropriate example of “some (of structure coordinates)” is a portion corresponding to any of domain I, II, or III of EGFR. More specific examples of “some (of structure coordinates)” are shown in the structure coordinates (b) to (l), (o) to (x), and (ya) to (yq).
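As an illustration of how a per-molecule portion could be picked out of Table 1 or Table 2, the following Python sketch maps an atom serial number to the molecule it belongs to, using the serial number ranges given in items (c), (f), (i), (k), (o), (r), (u), and (w). The labels such as `EGFR_1` and the representation of an atom as a dictionary with a `serial` key are assumptions made for the example; the actual layout of the tables is not reproduced here.

```python
# Atom serial number ranges (inclusive) quoted in the list of structure
# coordinates. The molecule labels are hypothetical names for this sketch.
TABLE1_RANGES = {
    "EGFR_1": (1, 3957),
    "EGFR_2": (3958, 7887),
    "EGF_1": (7888, 8250),
    "EGF_2": (8251, 8613),
}
TABLE2_RANGES = {
    "EGFR_1": (1, 3957),
    "EGFR_2": (3958, 7905),
    "EGF_1": (7906, 8291),
    "EGF_2": (8292, 8677),
}


def molecule_of(serial, ranges):
    """Return the label whose range contains serial, or None if outside all."""
    for label, (first, last) in ranges.items():
        if first <= serial <= last:
            return label
    return None


def select_molecule(atoms, ranges, label):
    """Keep only atoms (dicts with a 'serial' key) belonging to one molecule."""
    first, last = ranges[label]
    return [atom for atom in atoms if first <= atom["serial"] <= last]
```

Serial numbers outside the listed ranges return `None`, since the text does not assign them to any of the four molecules.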

Appropriate examples of the structure coordinates of an EGFR dimerization site are structure coordinates consisting of atomic coordinates of amino acid residues corresponding to Asn86, Gln194, Pro204, Ser205, Lys229, Phe230, Thr239, Pro242, Tyr246, Pro248, Thr249, Tyr251, Gln252, Met253, Ser262, Phe263, Gly264, Ala265, Tyr275, His280, Ser282, Cys283, Val284, Arg285, Ala286, and Lys303 of both EGFR proteins forming a dimer. The structure coordinates of the same may also be structure coordinates consisting of atomic coordinates of the above amino acids of one of the two EGFR proteins forming a dimer. Moreover, at least, as long as amino acids creating an interaction important in dimerization are contained, some of the above amino acids, or one or more amino acids other than the amino acids listed above, may be contained. “Correspond to” means that given amino acid numbers correspond to the amino acid sequence shown in SEQ ID NO: 1 regarding the amino acid residues of EGFR, and regarding the amino acid residues of EGF, given amino acid numbers correspond to the amino acid sequence shown in SEQ ID NO: 2.

Examples of amino acids creating an interaction important in dimerization include Thr249, Tyr246, and Gln252 of the 1st EGFR and Asn86, Cys283, and Ala286 of the 2nd EGFR, among the EGFR proteins forming a dimer. However, since amino acids creating an interaction important in dimerization may differ depending on methods employed for analyzing interfaces, examples of such amino acids are not limited to the above examples. As a preferred example, it is defined that amino acids composing a dimerization site may contain any one or more amino acids, as long as they contain amino acid residues corresponding to at least Thr249, Tyr246, and Gln252 of the 1st EGFR, and Asn86, Cys283, and Ala286 of the 2nd EGFR, where the two EGFR proteins form a dimer. Preferred specific examples thereof are shown in the structure coordinates (yk) to (ym). Moreover, residues (Glu293, Arg285, Arg405, Arg273, Asp254, and Gly458) that do not participate in direct interaction in dimerization, but in changeover of amino acid interaction like a switch when a ligand-dependent conformational change takes place, can also be said to be amino acids creating an interaction important in dimerization. Examples of a dimerization site consisting of such amino acid residues are shown in the structure coordinates (yn) to (yp). Furthermore, “amino acid residues adjacent thereto” means amino acid residues that are present at a distance of within 5 Å, and preferably within 3 Å, from other amino acid residues. The “receptor dimerization site” of (zb) includes an EGFR dimerization site, but is not limited thereto. Sites important for the exertion of protein functions, such as a dimerization site, may be conserved structurally at a high level regardless of differences among families or species. Hence, ligand-receptor complexes that form a complex with a mechanism similar to that for EGF/EGFR are thought to have structures equivalent to those of the EGFR dimerization sites of the present invention. Therefore, the present invention also provides the coordinates of receptor dimerization sites of such ligand-receptor complexes that form a complex with a mechanism similar to that of EGF/EGFR. Examples of such ligand-receptor complexes include receptors belonging to the ErbB family.

Appropriate examples of the structure coordinates of an EGF-EGFR binding site are structure coordinates consisting of atomic coordinates of amino acid residues corresponding to the following (A) and (B):

(A) His10, Asp11, Tyr13, Leu15, His16, Met21, Ile23, Ala25, Leu26, Lys28, Ala30, Cys31, Asn32, Cys33, Val35, Tyr37, Ile38, Gly39, Glu40, Arg41, Gln43, Tyr44, Arg45, Asp46, Leu47, Lys48, and Trp49 of EGF; and

(B) Asn12, Lys13, Leu14, Thr15, Gln16, Leu17, Gly18, Asp22, Arg29, Tyr45, Ala68, Leu69, Tyr89, Glu90, Leu98, Ser99, Tyr101, Leu325, His346, Leu348, Pro349, Val350, Asp355, Ser356, Phe357, Thr358, Leu382, Gln384, Gln408, His409, Gln411, Phe412, Ala415, Val417, Ile438, and Lys465 of EGFR.

Moreover, as long as at least the amino acids creating an interaction important in EGF-EGFR binding are contained, some of the amino acids listed above, or one or more amino acids other than the amino acids listed above, may be contained. Furthermore, since sugar chains or water molecules other than amino acids may play an important role in an interaction, the structure coordinates of an EGF-EGFR complex or binding sites may appropriately contain atomic coordinates of such sugar chains or water molecules.

Examples of amino acids creating an interaction important in EGF-EGFR binding include Tyr13, Glu40, Arg41, Asp46, and Leu47 of EGF and Phe357, Lys13, Asp355, Arg29, Leu382, Ala415, and Val417 of EGFR as shown in FIG. 9. However, since amino acids creating an interaction important in EGF-EGFR binding may differ depending on the methods employed for analyzing interfaces, examples of such amino acids are not limited to the above amino acids. As a preferred example, amino acids composing an EGF-EGFR interaction site may contain any one or more amino acids, as long as they contain a set of amino acid residues of EGF and EGFR listed in any one of structure coordinates (ya-1), (yb-1), and (yc-1). Preferred specific examples thereof are shown in the structure coordinates (ya) to (yh).

Furthermore, “amino acid residues adjacent thereto” means amino acid residues that are present at a distance of within 5 Å, and preferably within 3 Å, from other amino acid residues. The “ligand-receptor binding site” of (za) includes an EGF-EGFR binding site, but is not limited thereto. Sites important for the exertion of protein functions, such as a ligand-receptor binding site, may be conserved structurally at a high level regardless of differences among families or species. Hence, ligand-receptor complexes that form a complex with a mechanism similar to that for EGF/EGFR are thought to have structures equivalent to those of the EGF-EGFR binding sites of the present invention. Therefore, the present invention also provides the coordinates of ligand-receptor binding sites of such ligand-receptor complexes that form a complex with a mechanism similar to that of EGF/EGFR. Examples of such ligand-receptor complexes include receptors belonging to the ErbB family.

The “structure homologue” of (zc) refers to a ligand-receptor complex that is structurally analogous to an EGF-EGFR complex. Being structurally analogous refers to having structure similarity in at least a receptor dimerization site and/or a ligand-receptor binding site. “Having similarity” means having a structure specified by structure coordinates for which the root mean square deviation of α carbon atomic coordinates is 1.5 Å or less when the structure coordinates are superimposed on those of a peptide chain of the EGF-EGFR complex. Examples of structure homologues include ligand-receptor complexes that form a complex with a mechanism similar to that for EGF/EGFR and whose receptors belong to the ErbB family.
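The similarity criterion above can be illustrated with the following minimal sketch. It assumes that the two sets of α carbon coordinates have already been superimposed and are listed in the same residue order; the function names `ca_rmsd` and `is_structure_homologue` are hypothetical and are not part of any program named in this specification.

```python
import math

def ca_rmsd(coords_a, coords_b):
    """Root mean square deviation between two paired lists of (x, y, z)
    alpha-carbon coordinates, given in angstroms.

    Assumption: the structures are already superimposed; no fitting is done here.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for a, b in zip(coords_a, coords_b):
        # Squared distance between the paired alpha carbons.
        total += sum((p - q) ** 2 for p, q in zip(a, b))
    return math.sqrt(total / len(coords_a))

def is_structure_homologue(coords_a, coords_b, cutoff=1.5):
    """True when the RMSD is at or below the cutoff (1.5 angstroms in the text)."""
    return ca_rmsd(coords_a, coords_b) <= cutoff
```

The same function can be applied with a cutoff of 2.0 Å where that threshold is used instead.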

For the structure coordinates of EGF-EGFR binding sites or EGFR dimerization sites specified by the amino acid residues (ya) to (yq), the structure coordinates in the form of amino acid coordinates shown in Table 1 or Table 2 can be referred to, but the coordinates are not limited thereto. For example, structure coordinates of an EGF-EGFR complex for which the root mean square deviation of α carbon atomic coordinates is 2.0 Å or less when the structure coordinates are superimposed on those of the EGF-EGFR complex specified by the structure coordinates shown in Table 1 or Table 2 can be referred to. Moreover, structure coordinates of an EGF-EGFR complex that satisfy at least one of the features described in the following (1) to (4) can also be referred to:

-   (1) amino acid residues corresponding to Leu14, Tyr45, Leu69, Glu90, and Leu98 of EGFR are present at a distance from amino acid residues corresponding to Ile23, Leu26, and Lys28 of EGF so as to be able to interact with each other;
-   (2) amino acid residues corresponding to Val350, Asp355, and Phe357 of EGFR are present at a distance from amino acid residues corresponding to Tyr13, Leu15, and Arg41 of EGF so as to be able to interact with each other;
-   (3) amino acid residues corresponding to Leu382, Gln384, Phe412, and Ile438 of EGFR are present at a distance from amino acid residues corresponding to Gln43, Arg45, and Leu47 of EGF so as to be able to interact with each other;
-   (4) amino acid residues corresponding to Asn86, Cys283, and Ala286 of a second EGFR protein are present at a distance from amino acid residues corresponding to Thr249, Tyr246, and Gln252 of a first EGFR protein where the two EGFR proteins form a dimer, so as to be able to interact with each other.

Here, “a distance . . . so as to be able to interact with each other” is within 5 Å, and preferably within 3 Å.

Furthermore, the three-dimensional structure of a protein is defined by the relative spatial arrangement of the atoms composing the structure. Thus, generation of structure coordinates is a convenient process necessary for handling the data of the three-dimensional structure on a computer or the like. Hence, as is clear to persons skilled in the art, structure coordinates obtained by rotating the structure coordinates (a) to (zc) and/or subjecting the same coordinates to translational operation also completely represent the same three-dimensional structure as that represented by the structure coordinates prior to such operation.
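The point that rotation and translation leave the represented structure unchanged can be checked numerically, because all interatomic distances are preserved. The following is a minimal sketch under stated assumptions: only a rotation about the z axis is shown, and the helper names `rotate_z`, `translate`, and `pairwise_distances` are hypothetical.

```python
import math

def rotate_z(point, angle):
    """Rotate an (x, y, z) point about the z axis by `angle` radians."""
    x, y, z = point
    c, s = math.cos(angle), math.sin(angle)
    return (c * x - s * y, s * x + c * y, z)

def translate(point, shift):
    """Shift an (x, y, z) point by the vector `shift`."""
    return tuple(p + d for p, d in zip(point, shift))

def pairwise_distances(points):
    """All interatomic distances, in a fixed pair order."""
    return [
        math.dist(points[i], points[j])
        for i in range(len(points))
        for j in range(i + 1, len(points))
    ]

def same_structure(points_a, points_b, tol=1e-6):
    """True when two equally ordered coordinate sets share all pairwise distances.

    Note: distances alone do not distinguish mirror images; that is enough
    for this illustration because only rotations and translations are applied.
    """
    da, db = pairwise_distances(points_a), pairwise_distances(points_b)
    return len(da) == len(db) and all(abs(a - b) <= tol for a, b in zip(da, db))
```

For example, applying `rotate_z` and then `translate` to every atom of a coordinate set yields a new set for which `same_structure` holds against the original.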

Moreover, the present invention provides a computer-readable storage medium (recording medium) wherein any of the above structure coordinates (a) to (zc) has been stored (recorded). The computer-readable storage medium is not specifically limited, as long as it can introduce the stored structure coordinates into various programs (e.g., a program utilizing structure coordinates) on a computer. For example, it may be an electric temporary storage medium referred to as a memory, or a semipermanent storage medium such as a floppy disk, a hard disk, an optical disk, a magneto-optical disk, or a magnetic tape.

The structure coordinates and the storage medium storing the structure coordinates of the present invention are useful, because they can be used for screening for or designing compounds having action to regulate EGFR activity (in this specification, also referred to as “substances regulating EGFR activity”). Furthermore, the present invention provides EGF/EGFR or an EGF-EGFR complex having a three-dimensional structure that is characterized by any one of the above structure coordinates (a) to (z).

3. Screening Method or Designing Method Using Structure Information

A method for screening for or designing a substance regulating EGFR activity using structure coordinates of an EGF-EGFR complex according to the present invention comprises the following steps of:

-   (a) generating structure coordinates of a three-dimensional structure of a test substance; and
-   (b) superimposing the structure coordinates of (a) onto all or some of the structure coordinates of the EGF-EGFR complex in the same coordinate system so as to evaluate their state of fitting.

Specifically, the method of the present invention involves fitting the above structure coordinates of the EGF-EGFR complex with structure coordinates representing the three-dimensional structure of any test substance on a computer, numerically expressing their state of fitting using, for example, empirical scoring functions as indices, and then evaluating the ability of the test substance to bind to EGFR and/or EGF. Here, as the structure coordinates of the EGF-EGFR complex, all the structure coordinates or some of the structure coordinates, such as those of an EGF-EGFR binding site or an EGFR dimerization site, can be utilized.

The method of the present invention can further comprise a step of subjecting a screened or designed substance regulating EGFR activity to biochemical assay so as to evaluate its action to regulate EGFR activity.

In this specification, a substance regulating EGFR activity refers to an EGFR agonist or an EGFR antagonist. The EGFR agonist refers to a substance at least having activity to bind to EGFR, and preferably having activity to induce EGFR dimerization by its binding. The EGFR antagonist refers to a substance having activity to inhibit EGF-EGFR binding and/or activity to inhibit EGFR dimerization. Substances inhibiting EGF-EGFR binding by binding to EGF are also included in the examples of the EGFR antagonist. Substances having activity to promote or stabilize EGF-EGFR binding by binding to EGF so as to induce EGFR dimerization are included in the examples of the EGFR agonist. A preferred example of such an EGFR agonist is EGF.

The activity of the substance regulating EGFR activity to induce or inhibit EGFR dimerization (specifically, action regulating EGFR activity) can be confirmed by directly observing the success or failure of EGFR dimerization, and can also be confirmed by a method for detecting the success or failure of the process or the result (e.g., phosphorylation in the intracellular regions of EGFR and cell proliferation) of signal transduction that takes place by dimerization.

Examples of a test substance to be subjected to the screening method of the present invention include, but are not limited to, proteins, peptides, oligonucleotides, synthetic compounds, compounds derived from nature, fermentation products, cell extracts, plant extracts, and animal tissue extracts, and may be either novel substances or known substances.

3-1 Method for Identifying EGF-EGFR Binding Site and EGFR Dimerization Site

The present invention provides a method for identifying an EGF-EGFR binding site or an EGFR dimerization site in an EGF-EGFR complex using structure coordinates of the EGF-EGFR complex.

All the information regarding the structure coordinates of the EGF-EGFR complex is useful in the present invention. Extraction of particularly important information from the entirety of information results in much wider possibilities for the industrial applications thereof. Analysis of the crystal structure coordinates of the present invention makes it possible to specify amino acid residues at which EGF interacts with EGFR (EGF-EGFR binding site) and amino acid residues interacting upon EGFR dimerization (EGFR dimerization site). Moreover, it is also possible to extract properties or information particularly important in industrial applications by visually displaying molecular coordinates of the complex using existing molecular design software (e.g., Tripos, Inc.'s SYBYL™ and Accelrys Inc.'s InsightII™) and then extracting amino acid residues involved in the interaction.

Specifically, the method can comprise the steps of entering the structure coordinates of the EGF-EGFR complex into a computer, and specifying amino acid residues composing an EGF-EGFR binding site or an EGFR dimerization site through the analysis of the structure of the EGF-EGFR complex. Furthermore, the method may also comprise a step of visually displaying the three-dimensional structure of the EGF-EGFR complex on a computer. The step of specifying amino acid residues composing an EGF-EGFR binding site or an EGFR dimerization site through the analysis of the structure of the EGF-EGFR complex can be realized by analysis using visual observation and/or a computer program.

The structure coordinates obtained from the crystal of the EGF-EGFR complex according to the present invention are entered into a computer, or a storage medium of the computer, on which a computer program for expressing the three-dimensional structure coordinates of molecules runs, making it possible to express in detail the patterns of three-dimensional chemical interactions of the EGF-EGFR complex. The storage medium of the computer is not specifically limited, as long as it can introduce the structure coordinates obtained from a crystal of the EGF-EGFR complex to the program in the computer. For example, it may be an electric temporary storage medium referred to as a memory or a semipermanent storage medium such as a floppy disk, a hard disk, an optical disk, a magneto-optical disk, or a magnetic tape. Many computer programs for expressing the three-dimensional structure coordinates of protein molecules are commercially available, and these programs generally provide, for example, a means of entering the three-dimensional structure coordinates of molecules, a means of visually expressing the coordinates on a computer screen, a means of measuring each distance, bond angle, and the like between atoms within an expressed molecule, and a means of carrying out additional correction for the coordinates. Furthermore, it is also possible to use a program that provides a means of calculating molecular structural energy based on the coordinates of a molecule, and a means of calculating free energy considering a solvent molecule such as a water molecule. InsightII™ and QUANTA™, which are computer programs marketed by Accelrys Inc., are examples of appropriate programs for this purpose, but the examples in the present invention are not limited to these programs. Moreover, the program is generally introduced into a computer referred to as a workstation supplied by Silicon Graphics Inc. or Sun Microsystems Inc., and then used, but examples of a computer are not limited thereto. As structure coordinates, the structure coordinates (a) to (x) explained in section “2. Structure coordinates of EGF-EGFR complex” can be used.

The step of specifying amino acid residues composing an EGF-EGFR binding site or an EGFR dimerization site by visual observation and/or a computer program is a step of specifying amino acid residues participating in protein-protein interaction by analyzing the interface between protein molecules by visual observation and/or use of a computer program. At this time, amino acids participating in interaction are specified considering distance, strength, type, and the like regarding interaction between amino acid residues. For example, in the crystal structure of the complex, when amino acid residues of EGFR that are present at a distance of within several angstroms from each amino acid residue of EGF are extracted, amino acid residues on EGFR directly interacting with EGF can be extracted. Conversely, when amino acid residues in EGF that are present at a distance of within several angstroms from each amino acid residue of EGFR are extracted, amino acid residues on EGF directly interacting with EGFR can be specified. Furthermore, extraction of amino acid residues in EGFR that are present at a distance of within several angstroms from another EGFR protein, where the EGFR proteins form a dimer, makes it possible to specify amino acid residues important in EGFR dimerization.

For example, amino acid residues of EGF that are present at a distance of within 3 Å from EGFR are important in direct binding with EGFR. Conversely, amino acid residues in EGFR that are present at a distance of within 3 Å from EGF are important in direct interaction with the EGF molecule. Furthermore, amino acid residues in EGFR that are present at a distance of within 3 Å from another EGFR protein, where the EGFR proteins form a dimer, are important for EGFR dimerization.
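The distance-based extraction described above can be sketched as follows. This is an illustration only: each chain is assumed to be supplied as a mapping from a residue label to a list of atomic (x, y, z) coordinates in angstroms, the function name `interface_residues` is hypothetical, and the coordinates in the usage note are toy values rather than values from Table 1 or Table 2.

```python
import math

def interface_residues(chain_a, chain_b, cutoff=3.0):
    """Return the residues of chain_a that have at least one atom within
    `cutoff` angstroms of any atom of chain_b.

    chain_a, chain_b: dict mapping residue label -> list of (x, y, z) tuples.
    The result keeps the iteration order of chain_a.
    """
    # Flatten the partner chain once so every residue is tested against all atoms.
    partner_atoms = [atom for atoms in chain_b.values() for atom in atoms]
    hits = []
    for label, atoms in chain_a.items():
        if any(math.dist(a, b) <= cutoff for a in atoms for b in partner_atoms):
            hits.append(label)
    return hits
```

Calling the function once with the ligand as `chain_a` and once with the receptor as `chain_a` gives the two sides of the interface; calling it with the two receptor chains of a dimer gives candidate dimerization residues. A cutoff of 5.0 can be passed where the wider criterion is intended.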

Examples of interaction types include electrostatic interaction, hydrophobic interaction, Van der Waals interaction, and hydrogen bonds. In addition, not only simple specification of interaction based on distance, but also specification of interaction considering the orientation of amino acid side chains corresponding to each other is preferred. The strength of interaction is affected by distance, interaction type, orientation of side chains, the presence of water molecules, and the like. When analysis is made by a computer program, it is possible to use a program calculating the distance between amino acid residues, a program specifying the interaction type based on the types of two amino acid residues inferred to create interaction, or the like. However, when a plurality of amino acid residues capable of interacting with each other are present in close proximity, it may be difficult to determine which interactions of which residues are important in protein-protein interaction by calculation only. Hence, it is preferred to finally specify amino acids participating in interaction by visual observation, comprehensively considering distance, strength, type, and the like regarding interaction between amino acid residues.

The amino acid residues shown below are useful for analyzing interaction that is specified from the crystal structure analysis of the EGF-EGFR complex based on the distance for interaction.
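A program that specifies an interaction type from the types of two amino acid residues could, in its simplest form, look like the sketch below. The residue groupings and the rule order are coarse assumptions made for illustration; they ignore distance, side chain orientation, and water molecules, which is why the text prefers a final check by visual observation. The name `classify_interaction` is hypothetical.

```python
# Coarse residue classes (an assumption for illustration, not from the source).
POSITIVE = {"Lys", "Arg", "His"}
NEGATIVE = {"Asp", "Glu"}
POLAR = {"Ser", "Thr", "Asn", "Gln", "Tyr"}
HYDROPHOBIC = {"Ala", "Val", "Leu", "Ile", "Met", "Phe", "Trp"}

def classify_interaction(residue_a, residue_b):
    """Guess an interaction type from two residue labels such as 'Arg41'.

    Only the three-letter residue name is used; the residue number is ignored.
    """
    a, b = residue_a[:3], residue_b[:3]
    charged = POSITIVE | NEGATIVE
    if a in charged and b in charged:
        # Attractive or repulsive depending on the signs of the charges.
        return "electrostatic"
    if a in HYDROPHOBIC and b in HYDROPHOBIC:
        return "hydrophobic"
    hydrogen_bond_capable = charged | POLAR
    if a in hydrogen_bond_capable and b in hydrogen_bond_capable:
        return "hydrogen bond"
    # Fallback for every other pair in contact.
    return "van der Waals"
```

Such a rule table only proposes a candidate type for a residue pair already found to be in contact; it does not establish that the interaction exists or is important.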

-   (i) Examples of amino acid residues in EGF composing a binding site with EGFR: His10, Asp11, Tyr13, Leu15, His16, Met21, Ile23, Ala25, Leu26, Lys28, Ala30, Cys31, Asn32, Cys33, Val35, Tyr37, Ile38, Gly39, Glu40, Arg41, Gln43, Tyr44, Arg45, Asp46, Leu47, Lys48, and Trp49
-   (ii) Examples of amino acid residues in EGFR composing a binding site with EGF: Asn12, Lys13, Leu14, Thr15, Gln16, Leu17, Gly18, Asp22, Arg29, Tyr45, Ala68, Leu69, Tyr89, Glu90, Leu98, Ser99, Tyr101, Leu325, His346, Leu348, Pro349, Val350, Asp355, Ser356, Phe357, Thr358, Leu382, Gln384, Gln408, His409, Gln411, Phe412, Ala415, Val417, Ile438, and Lys465
-   (iii) Examples of amino acid residues in EGFR forming an EGFR dimerization site: Asn86, Gln194, Pro204, Ser205, Lys229, Phe230, Thr239, Pro242, Tyr246, Pro248, Thr249, Tyr251, Gln252, Met253, Ser262, Phe263, Gly264, Ala265, Tyr275, His280, Ser282, Cys283, Val284, Arg285, Ala286, and Lys303

Examples of EGF-EGFR binding sites and EGFR dimerization sites are not limited to the above examples.

As described above, an interface can also be defined as being composed of amino acid residues in EGFR that are present at a distance of within 3 Å from EGF. In addition, only amino acids that are important in interaction can also be defined as composing an EGF-EGFR binding site or an EGFR dimerization site while focusing on interaction type and strength. That is, results may differ depending on the method used for analyzing an interface. However, as long as the screening method according to the present invention can still be implemented, such interaction is encompassed in the scope of the present invention. Examples of extraction of amino acids important in interaction considering not only the distance between amino acid residues for interaction, but also interaction type and side chain orientation, are explained in FIGS. 9 and 10, Example 6, and the above section “2. Structure coordinates of EGF-EGFR complex.”

3-2 Identification of EGF-EGFR Binding Site and Dimerization Site

Through the use of the method for identifying EGF-EGFR binding sites or EGFR dimerization sites in an EGF-EGFR complex using the structure coordinates of the EGF-EGFR complex, the present invention further provides the following methods:

I. a method for screening for a substance regulating EGFR activity, comprising the following steps of:

-   (1) identifying an EGF-EGFR binding site or an EGFR dimerization site using the structure coordinates of the EGF-EGFR complex;
-   (2) screening for a candidate substance regulating EGFR activity using the structure coordinates of the EGF-EGFR binding site or the EGFR dimerization site identified by (1); and
-   (3) subjecting the substance regulating EGFR activity obtained by (2) to biochemical assay so as to evaluate the action regulating EGFR activity.

II. a method for screening for a substance regulating EGFR activity, comprising the following steps of:

-   (1) identifying an EGF-EGFR binding site or an EGFR dimerization site using the structure coordinates of the EGF-EGFR complex;
-   (2) designing a pharmacophore of a substance regulating EGFR activity using the structure coordinates of the EGF-EGFR binding site or the EGFR dimerization site identified by (1);
-   (3) screening for the substance regulating EGFR activity using the pharmacophore obtained by (2); and
-   (4) subjecting the substance regulating EGFR activity obtained by (3) to biochemical assay so as to evaluate the action regulating EGFR activity.

III. a method for designing a substance regulating EGFR activity, comprising the following steps of:

-   (1) identifying an EGF-EGFR binding site or an EGFR dimerization site using the structure coordinates of the EGF-EGFR complex;
-   (2) designing a pharmacophore of a substance regulating EGFR activity using the structure coordinates of the EGF-EGFR binding site or the EGFR dimerization site identified by (1);
-   (3) designing a compound using the pharmacophore obtained by (2); and
-   (4) subjecting the compound obtained by (3) to biochemical assay so as to evaluate the action regulating EGFR activity.

These methods are explained in detail below.

3-3 Screening Method Using Structure Coordinates of EGF-EGFR Complex

A method for screening for a substance regulating EGFR activity using structure coordinates of an EGF-EGFR complex according to the present invention comprises the following steps of:

-   (a) generating structure coordinates of a three-dimensional structure of a test substance; and
-   (b) superimposing the structure coordinates of (a) onto all or some of the structure coordinates of an EGF-EGFR complex in the same coordinate system so as to evaluate their state of fitting.

Specifically, such a method involves fitting the above structure coordinates of the EGF-EGFR complex to structure coordinates representing a three-dimensional structure of any test substance on a computer, expressing their state of fitting numerically using, for example, empirical scoring functions as indices, and then evaluating the binding ability of the test substance to EGFR and/or EGF.

As described above, the structure coordinates of the EGF-EGFR complex are used, the shape of an EGF-EGFR binding site or an EGFR dimerization site is assigned, and then a compound that can bind to the site can be subjected to computer screening using commercial package software such as DOCK (Ewing, T. J. et al., “DOCK 4.0: Search Strategies for Automated Molecular Docking of Flexible Molecule Database,” J. COMP. AIDED MOL. DES. 15(5):411-428 (2001)), AutoDock (Morris, G. M., et al., “Automated Docking Using a Lamarckian Genetic Algorithm and Empirical Binding Free Energy Function,” J. COMPUTATIONAL CHEM. 19:1639-1662 (1998)), Ludi™, or LigandFit™. For example, amino acid residues in EGFR that can interact with EGF (Asn12, Lys13, Leu14, Thr15, Gln16, Leu17, Gly18, Asp22, Arg29, Tyr45, Ala68, Leu69, Tyr89, Glu90, Leu98, Ser99, Tyr101, Leu325, His346, Leu348, Pro349, Val350, Asp355, Ser356, Phe357, Thr358, Leu382, Gln384, Gln408, His409, Gln411, Phe412, Ala415, Val417, Ile438, and Lys465) provide pockets or clefts to which substances regulating EGFR activity (agonist or antagonist) can bind. Thus, it becomes possible to conduct computer screening using such sites as an aid.

The step of superimposing structure coordinates of a test substance onto all or some of the structure coordinates of the EGF-EGFR complex in the same coordinate system so as to evaluate their state of fitting can be implemented using the above-mentioned commercial package software and a computer system on which the software can run. The computer system appropriately comprises various means necessary for running target software such as a storage means for storing structural formulae of compounds, a means for generating coordinates of three-dimensional structures of compounds, a storage means for storing structure coordinates of compounds, a storage means for storing the structure coordinates of the EGF-EGFR complex, a storage means for storing evaluation results, a means for displaying the contents of each storage means, a means for data entry such as a keyboard, a means for displaying, such as a display, and a central processing unit.

In this specification, a specific example using DOCK as software for analysis is shown. Any software may be used, as long as it makes a simulation of the docking procedure of a ligand to a protein possible on a computer. For example, software programs such as FlexX™ (Tripos, Inc.), LigandFit™ (Accelrys Inc.), or Ludi™ (Accelrys Inc.) can be used.

First, a virtual spherical body referred to as a sphere is disposed using a SPHGEN program in the periphery of a pocket and a cleft to which a substance regulating EGFR activity (agonist or antagonist) is thought to be able to bind. This sphere functions as an anchor for docking of a ligand. In addition, sites at which spheres are generated can be limited to specific pockets or specific clefts, or spheres can be generated at a plurality of sites. When too many spheres are generated, adjacent spheres can be manually removed.

Next, grids are generated at a portion and the periphery thereof where EGF interacts with EGFR using a GRID program, so as to express an electronic and steric environment for receptor residues within an assigned range as a scalar value on each grid. In addition, the force field of AMBER™ (Pearlman, D., et al., “Amber, a Computer Program for Applying Molecular Mechanics, Normal Mode Analysis, Molecular Dynamics and Free Energy Calculation to Elucidate the Structures and Energies of Molecules,” COMP. PHYS. COMMUN. 91:1-41 (1995)) or the like is utilized to calculate each grid value, but the force field is not limited to AMBER™, and other force fields may also be used. Furthermore, depending on the shape on the protein side, adjustment can also be made by altering grid information so as to express docking of a compound in a more realistic form.
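The idea of expressing the receptor environment as a scalar value on each grid point, and then scoring a compound by grid lookup, can be illustrated with the toy sketch below. It is not the GRID program of DOCK and does not use the AMBER™ force field: it assumes a single Lennard-Jones-like 12-6 steric term with a hypothetical optimal distance `r0`, nearest-grid-point lookup, and hypothetical function names `build_grid` and `score_pose`.

```python
import math

def build_grid(receptor_atoms, origin, spacing, shape, r0=4.0, min_dist=0.5):
    """Precompute a steric score at every grid point.

    receptor_atoms: list of (x, y, z) tuples in angstroms.
    origin, spacing, shape: grid corner, step size, and (nx, ny, nz) point counts.
    r0: hypothetical distance at which the 12-6 term is most favorable (-1 per atom).
    min_dist: lower bound on distance, which avoids division by zero.
    """
    grid = {}
    for i in range(shape[0]):
        for j in range(shape[1]):
            for k in range(shape[2]):
                point = (origin[0] + i * spacing,
                         origin[1] + j * spacing,
                         origin[2] + k * spacing)
                value = 0.0
                for atom in receptor_atoms:
                    r = max(math.dist(point, atom), min_dist)
                    ratio = r0 / r
                    value += ratio ** 12 - 2.0 * ratio ** 6
                grid[(i, j, k)] = value
    return grid

def score_pose(ligand_atoms, grid, origin, spacing):
    """Sum the precomputed values at the grid point nearest each ligand atom.

    Atoms falling outside the grid contribute 0.0 (a simplifying assumption).
    Lower totals indicate a better steric fit in this toy model.
    """
    total = 0.0
    for atom in ligand_atoms:
        index = tuple(round((c - o) / spacing) for c, o in zip(atom, origin))
        total += grid.get(index, 0.0)
    return total
```

The design choice shown here, computing the receptor contribution once per grid point and reusing it for every candidate pose, is what makes searching a large compound database tractable; the particular energy term is interchangeable.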

Next, a search is conducted on a compound database. Using the DOCK program, a compound is searched for that is located in the vicinity of spheres existing in the periphery of a pocket and a cleft and that takes a three-dimensional conformation so as not to repel steric elements or electronic elements on the grids. At this time, the three-dimensional conformation of the docked compound is optimized by a conformation-generating function integrated in the DOCK program. However, whether or not appropriate docking is finally conducted is comprehensively determined based on empirical judgment made using scores at the time of docking, visual observation, and the like. In this manner, a series of selected compound groups that were judged to be able to appropriately conduct docking can be considered as substances regulating EGFR activity (agonist or antagonist) at a certain probability.

The above method promotes more efficient drug discovery and drug development. Specifically, predicting the arrangement of structure coordinates that fit the properties and shapes of the interaction sites of an EGF-EGFR complex, and selecting by calculation a compound having a structure capable of agreeing with the putative structure coordinates, make it possible to efficiently select an activity-controlling substance specific to EGFR from among many compounds.

A candidate substance (test substance) to be examined for activity-regulating action on EGFR may be either a known or a novel substance. The structure, origin, physical properties, and the like thereof are not specifically limited. Such a candidate substance may be any of a native compound, a synthetic compound, a high molecular weight compound, a low molecular weight compound, a peptide, or a nucleic acid analogue. In terms of future pharmaceutical development, a low molecular weight compound is preferred. For example, compound information registered with the Available Chemicals Directory (ACD) (MDL Information Systems, Inc., San Leandro, Calif.), CMC (Comprehensive Medicinal Chemistry), MDDR (MDL Drug Data Report), or the like is beneficial.

As a program to generate three-dimensional structure coordinates of such low molecular weight compounds, programs such as CORINA™ (Molecular Networks GmbH), Concord™ (Tripos, Inc.), Converter™, or the like can be utilized. The thus generated coordinates of a low molecular weight compound and those of the EGF-EGFR complex can be automatically bound using a molecular docking package such as DOCK or the like, or can be interactively bound using software for displaying molecules such as InsightII™. At this time, as indices for evaluating their state of fitting using these programs, calculated free energy values, empirical scoring functions, shape complementarity, and the like, evaluated for the entire compound-bound complex, can be freely chosen and used. By the use of the indices, the quality of the binding can be evaluated objectively.

In the method for screening for a substance regulating EGFR activity using the structure coordinates of the EGF-EGFR complex of the present invention, as the structure coordinates, structure coordinates of an EGF-EGFR binding site or an EGFR dimerization site identified by the method described in section “3-1” can be used.

Examples of the structure coordinates of the EGF-EGFR binding site or the EGFR dimerization site include the structure coordinates of each of (ya) to (yh) and (yj) to (yp) explained in section “2. Structure coordinates of EGF-EGFR complex.” Furthermore, through the use of structure coordinates of a ligand-receptor complex obtained using the structure coordinates of the EGF-EGFR complex by a later-described homology modeling method or the molecular replacement method, screening of the receptor agonist and the receptor antagonist can be conducted by a similar method.

3-4 Method for Screening for and Method for Designing Substance Regulating EGFR Activity Using Pharmacophores

A pharmacophore is a representation of physicochemical features of a compound, which is required for binding with a target protein. A pharmacophore represents structural and physicochemical features of a compound as spheres representing pharmacophoric features, so that it can be defined by determining relative distances among spheres representing pharmacophoric features or can be defined by determining relative distances among specific functional groups. Furthermore, a pharmacophore can be defined freely using techniques that can be generally employed by persons skilled in the art, and the method therefor is not limited to the above methods.
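Defining a pharmacophore by relative distances among feature spheres can be illustrated with the following minimal sketch. It is not the algorithm of Catalyst™ or of any other named program. It assumes that a pharmacophore and a candidate compound are each given as an ordered list of (feature type, center) pairs, that features correspond by list position, and that a single distance tolerance applies to every pair; the names `distance_signature` and `matches_pharmacophore` and the feature labels are hypothetical.

```python
import math
from itertools import combinations

def distance_signature(features):
    """Map each pair of feature indices to the distance between their centers.

    features: list of (feature_type, (x, y, z)) tuples.
    """
    return {
        (i, j): math.dist(features[i][1], features[j][1])
        for i, j in combinations(range(len(features)), 2)
    }

def matches_pharmacophore(pharmacophore, candidate, tolerance=1.0):
    """True when the candidate has the same feature types in the same order
    and every inter-feature distance agrees within `tolerance` angstroms.
    """
    if [f[0] for f in pharmacophore] != [f[0] for f in candidate]:
        return False
    reference = distance_signature(pharmacophore)
    observed = distance_signature(candidate)
    return all(abs(reference[pair] - observed[pair]) <= tolerance for pair in reference)
```

Because only relative distances are compared, a candidate matches regardless of how it is positioned or oriented in space, which is consistent with a definition based on relative distances among spheres.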

A sphere representing pharmacophoric features means a spatial region holding various physicochemical properties including hydrophobicity, electrostatic property, and capability of forming hydrogen bonds. For example, according to Catalyst™, which is a pharmacophore construction program (Accelrys Inc., San Diego, Calif.), eight types of spheres representing pharmacophoric features are shown: “Hydrogen-bond Acceptor (furthermore, Hydrogen-bond Acceptor lipid can also be classified),” “Hydrogen-bond Donor,” “Hydrophobic (furthermore, Hydrophobic Aromatic and Hydrophobic aliphatic can also be classified),” “Negative Charge,” “Negative Ionizable,” “Positive Charge,” “Positive Ionizable,” and “Ring Aromatic.” Users of the program can add new definitions, and can utilize those other than the above components. Specifically, having a hydrophobic region, a hydrogen bond receptor region, a positive ion region, a ring aromatic region, and the like are specified as physicochemical properties. Spheres representing pharmacophoric features can be expressed as a spherical region with Å radius having these physicochemical properties. Examples of atoms and functional groups fitting each sphere representing pharmacophoric features are defined in the manual attached to a program such as Catalyst™ (Accelrys Inc., Catalyst Documentation Release 4.5, 1999).

In other words, the substance regulating EGFR activity of the present invention is specified as a substance that can selectively fit protein-protein interaction sites (including a ligand-receptor interface and a receptor-receptor interface) of the EGF-EGFR complex, and is represented by a chemical structure that satisfies the structure coordinates of spheres having certain physicochemical properties and corresponding to each other.

By analyzing the structure information of an EGF-EGFR co-crystal and specifying properties provided by the steric arrangement of the previously described amino acid residues or the three-dimensional arrangement of structural water as a pharmacophore, a substance regulating EGFR activity (agonist or antagonist) can be screened for by a computer. Moreover, water molecules (structural water) existing between molecules may play a role in the formation of a complex, and protein-protein interaction mediated by such water molecules is specified by graphics observation or the like. Furthermore, among amino acid residues and water molecules which can participate in interaction, the analysis of sites or amino acid residues forming hydrophobic interaction, ionic bonds, or hydrogen bonds, and of molecular shapes (pockets and clefts) provided by the active conformation of EGF and EGFR in the structure of the complex makes it possible to present pharmacophores required for molecular design and computer screening. Examples of interactions of the EGF-EGFR complexes useful for the construction of pharmacophores are shown in FIGS. 9 and 10. The amino acid residues shown in these figures are examples of the amino acids that create important interaction upon EGF-EGFR binding and EGFR dimerization, respectively.

3-4-1 Method for Designing Pharmacophores

The present invention provides a method for designing a pharmacophore, which uses structure coordinates of an EGF-EGFR complex. Here, “pharmacophore” means the pharmacophore of a substance regulating EGFR activity. The method for designing a pharmacophore can comprise the steps of analyzing a three-dimensional structure represented by the structure coordinates of the EGF-EGFR complex (by visual observation and/or an appropriate computer program); specifying a partial structure (e.g., amino acid residues, structural water, pockets, and clefts) that can be used as a pharmacophore; and converting the partial structure into spheres representing pharmacophoric features, so as to generate the pharmacophore. The step of specifying a partial structure that can be used as a pharmacophore is conducted according to the means explained in “3-1.” The step of converting the partial structures into spheres representing pharmacophoric features, so as to generate a pharmacophore, can be implemented using commercial software such as Catalyst and a computer system on which the software can run.

By setting relative positional relationships in three-dimensional space among the spheres representing pharmacophoric features, a search formula (pharmacophore) on the software program Catalyst™ can be constructed. In addition, the positions may be specified using coordinates (x, y, z) or using the collection of slant distances connecting each point. Moreover, when an actual distance that enables interaction with a receptor, calculation errors, and the like are taken into consideration, it is not necessary for each distance between chemical functions to be always precisely defined. According to Catalyst™, the position of each chemical function is generally defined by a sphere with a radius ranging from 1.5 Å to 2.0 Å centered on each point, or by a direct distance between ±3 Å and 4 Å connecting each point. These numerical values can also be changed as appropriate.
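The two ways of specifying a pharmacophore described above (by coordinates, or by the collection of distances connecting each point) can be illustrated with a minimal Python sketch. It is not the Catalyst™ data model; the class name `FeatureSphere`, the function `pairwise_distances`, and all coordinates are hypothetical and chosen only for illustration. The default radii follow the 1.5 Å to 2.0 Å range mentioned above.

```python
from dataclasses import dataclass
from itertools import combinations
from math import dist


@dataclass(frozen=True)
class FeatureSphere:
    """One sphere representing a pharmacophoric feature (hypothetical type)."""
    kind: str            # e.g. "Hydrophobic", "Hydrogen-bond Donor"
    center: tuple        # (x, y, z) in angstroms
    radius: float = 1.5  # tolerance radius in angstroms


def pairwise_distances(spheres):
    """Return {(i, j): center-to-center distance} for every pair of spheres."""
    return {
        (i, j): dist(spheres[i].center, spheres[j].center)
        for i, j in combinations(range(len(spheres)), 2)
    }


# Hypothetical coordinates, not taken from the EGF-EGFR structure.
pharmacophore = [
    FeatureSphere("Hydrophobic", (0.0, 0.0, 0.0)),
    FeatureSphere("Hydrogen-bond Donor", (3.0, 4.0, 0.0)),
    FeatureSphere("Positive Ionizable", (0.0, 0.0, 6.0), radius=2.0),
]

distances = pairwise_distances(pharmacophore)
for pair, d in distances.items():
    print(pair, round(d, 2))
```

The coordinate form and the distance form carry the same relative arrangement; the distance form is independent of the coordinate system in which the complex happens to be expressed.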

In the method for designing a pharmacophore using the structure coordinates of the EGF-EGFR complex of the present invention, as the structure coordinates of the EGF-EGFR complex, the structure coordinates of EGF-EGFR binding sites or EGFR dimerization sites identified by the method described in “3-1” can be used. Examples of the structure coordinates of the EGF-EGFR binding sites or the EGFR dimerization sites include the structure coordinates (ya) to (yh) and (yj) to (yp) that are each explained in section “2. Structure coordinates of EGF-EGFR complex.” Furthermore, through the use of structure coordinates of a ligand-receptor complex obtained using the structure coordinates of the EGF-EGFR complex by the homology modeling method or the molecular replacement method as described below, pharmacophores useful in screening for the receptor agonist and antagonist can be designed by a similar method.

Furthermore, based on compounds showing EGFR-activity-regulating action obtained by the above screening method using a computer or the above experimental screening method, a pharmacophore can be defined by expressing common structural characteristics among these compounds, and then determining the relative distance between spheres representing pharmacophores. It can also be defined by determining the relative distance between specific functional groups.

Moreover, when sites binding to a protein differ depending on compounds, there may be no physicochemical properties that are common to all compounds. In this case, there is a need to perform appropriate clustering of compounds, and then to determine common physicochemical properties within each cluster.

We have succeeded in designing a pharmacophore of a substance regulating EGFR activity utilizing the structure coordinates of the EGF-EGFR complex. A specific pharmacophore is disclosed in Example 5 and Example 6.

3-4-2 Screening Method Using Pharmacophores

The present invention provides a method for screening for a substance regulating EGFR activity using a pharmacophore. The use of a specified pharmacophore makes it possible to discover an EGFR agonist or an EGFR antagonist. Here, a case using Catalyst™ (Accelrys Inc.) as software for analysis is described. However, any software may be used, as long as it enables extraction of chemical functions from a ligand and searching for a compound that can have spatial arrangement analogous to that of the extracted chemical functions. For example, Unity™, which is a module of SYBYL™ (Tripos, Inc.), may also be used. At this time, when clustering of compounds is required, a program such as Daylight Clustering Package™ (Daylight Chemical Information Systems, Inc., Aliso Viejo, Calif.) can be used.

Using the designed pharmacophore, a previously prepared compound structure database is subjected to computer screening. The pharmacophore information and the spatial arrangement of the three-dimensional structure of a compound are compared, and then whether or not the compound satisfies the properties of the pharmacophore is determined by calculation. An advantage of computer screening is that the search object can be any subset of all the theoretically conceivable compounds. In general, a corporation's own compound database or a commercial compound database is converted for use in computer search. Examples include the Available Chemicals Directory (MDL Information Systems, Inc.), databases produced by compound sales companies or agents, a database of virtual compounds generated using virtual combinatorial synthesis techniques, a database of compounds derived from natural products, and a drug database.

A group of hit compounds selected by the above computer screening may bind to the pharmacophore of EGFR, so that they are EGFR agonists or antagonists (substances regulating EGFR activity) with a certain probability.

The screening method using a pharmacophore can comprise the steps of (1) generating structure coordinates of a three-dimensional structure of a test substance, and (2) superimposing the structure coordinates of (1) onto structure coordinates of spheres specifying a pharmacophore in the same coordinate system, so as to evaluate their fitting state. Here, a pharmacophore indicates the pharmacophore of the substance regulating EGFR activity designed in “3-4-1.” At this time, preferably, the structure coordinates of (1) match the relative arrangement and features of at least 3 spheres representing pharmacophoric features. These steps can be implemented using the above commercial software and a computer system on which the software can run.
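The fitting evaluation of step (2) can be sketched as follows, assuming the test substance has already been superimposed into the coordinate system of the pharmacophore. This is an illustrative simplification, not the algorithm of Catalyst™ or any other commercial program; the function names, feature labels, and coordinates are hypothetical. A sphere is counted as matched when a feature of the same type lies within its radius, and a compound is treated as a hit when at least 3 spheres are matched.

```python
from math import dist


def count_matched_spheres(compound_features, spheres):
    """Count spheres that contain a compound feature of the same type.

    compound_features: list of (kind, (x, y, z)) for the test substance.
    spheres: list of (kind, (x, y, z), radius) defining the pharmacophore.
    """
    matched = 0
    for kind, center, radius in spheres:
        if any(k == kind and dist(pos, center) <= radius
               for k, pos in compound_features):
            matched += 1
    return matched


def fits_pharmacophore(compound_features, spheres, min_matches=3):
    """Return True when at least `min_matches` spheres are satisfied."""
    return count_matched_spheres(compound_features, spheres) >= min_matches


# Hypothetical pharmacophore and test substances (coordinates in angstroms).
spheres = [
    ("Hydrophobic", (0.0, 0.0, 0.0), 1.5),
    ("Hydrogen-bond Donor", (3.0, 4.0, 0.0), 1.5),
    ("Positive Ionizable", (0.0, 0.0, 6.0), 2.0),
]
compound_a = [
    ("Hydrophobic", (0.5, 0.0, 0.0)),
    ("Hydrogen-bond Donor", (3.0, 4.0, 1.0)),
    ("Positive Ionizable", (0.0, 0.0, 7.0)),
]
compound_b = [
    ("Hydrophobic", (0.5, 0.0, 0.0)),
    ("Hydrogen-bond Donor", (3.0, 4.0, 1.0)),
    ("Positive Ionizable", (0.0, 0.0, 9.0)),  # 3 angstroms off, outside the sphere
]

print(fits_pharmacophore(compound_a, spheres))
print(fits_pharmacophore(compound_b, spheres))
```

A practical search would also sample conformers and alignments of each test substance; the sketch evaluates a single fixed pose only.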

In addition, a database of the group of compounds selected by means of search formulae can be generated, and the compounds can then be docked to the protein structure using commercial docking software, so as to increase the selection rate.

In addition, compounds that bind to the EGF-EGFR complex can be used as filters effective for computer screening using DOCK or the like, and can also be subjected to de novo design, which involves arranging fragments fitting each pharmacophore, and then binding each fragment at an appropriate functional group so as to construct a compound.

3-4-3 Design Method Using Pharmacophores

The use of the structure coordinates of the EGF-EGFR complex of the present invention enables molecular design (e.g., increased activity and provision of selectivity) of substances regulating EGFR activity (agonists and antagonists).

Construction of a binding model of a compound selected in screening or a derivative thereof enables the model to be used for optimizing derivatization of the compound thus obtained. For example, a model structure in which an EGFR agonist or an EGFR antagonist obtained by screening is docked to EGF or EGFR can also be predicted on a computer. This model structure is very useful for discovering a direction for derivatization that enhances interaction between a compound and an amino acid residue adjacent thereto, and for presenting an appropriate direction for improving metabolism, toxicity, and the like without affecting the activity. Furthermore, the model structure also provides highly useful information for considering differences among species in pharmacological experiments or metabolism experiments, or for supporting derivatization and synthesis aimed at improving selectivity for a target protein for the purpose of side-effect alleviation.

The method for designing a substance regulating EGFR activity using a pharmacophore can comprise the steps of: (1) generating each group of fragments having functional groups corresponding to each sphere representing pharmacophoric features; and (2) binding each fragment selected one-by-one from each group of the fragments generated in (1), so as to design a compound. Furthermore, the method may also comprise the steps of: (3) superimposing structure coordinates of the designed compound onto the structure coordinates or a pharmacophore of the EGF-EGFR complex in the same coordinate system so as to evaluate their fitting state; (4) substituting one or more fragments of the compound with a fragment(s) having a property of enhancing or attenuating interaction with amino acid residues adjacent thereto; and (5) repeating steps (3) and (4).

Step (1) is for generating a group of fragments having functional groups corresponding to each sphere representing pharmacophoric features. In the present invention, a fragment means a partial structure of a compound. For example, fragments having functional groups corresponding to spheres representing features of one pharmacophore are collected as a group of fragments. Furthermore, fragments having functional groups corresponding to spheres representing features of another pharmacophore are collected as a group of fragments. In this manner, each fragment group having functional groups corresponding to each condition is generated. At this time, fragments having functional groups that match a plurality of pharmacophores, or fragments simultaneously having functional groups matching one pharmacophore and functional groups matching another pharmacophore, may also be collected. A manner of collecting fragments is not specifically limited. For example, fragments may be collected from substances regulating EGFR activity. A new fragment may also be generated by a procedure such as addition of a substituent to one collected fragment, extension or reduction of carbon chains, substitution of an atom, or the like. At this time, it is more useful to generate fragments such that a known substance regulating EGFR activity is improved.
For example, water molecules (structural water) existing between EGFR and a substance regulating EGFR activity may also play a role in the formation of a complex of the two, and the interaction between EGFR and the substance regulating EGFR activity mediated by such molecules can be specified by graphics observation or the like. Furthermore, such interaction can be observed by analyzing amino acid residues and water molecules that can participate in interaction, in particular, sites or amino acid residues forming hydrophobic interaction, ionic bonds, or hydrogen bonds, and furthermore, a molecular shape given by the active conformation of a substance regulating EGFR activity in the structure of the complex. In this manner, it is possible to create new fragments in order to discover a direction for derivatization that enhances interaction between a known substance regulating EGFR activity and amino acid residues adjacent thereto; or in order to present a direction appropriate for proceeding with pharmacological improvement in such areas as metabolism and toxicity without affecting the activity.

Next, a compound model is built by binding fragments selected one-by-one from the above-generated fragment group. For example, on paper, fragments selected one by one from each fragment group determined in step (1) may be bound to each other so as to build a compound model. Preferably, a compound model is built using computer software. Software used herein may be any software, as long as it can build the structure of a compound. For example, the program Ludi™ (Accelrys Inc.), CombiLibMaker™, which is a module of Sybyl™ (Tripos, Inc.), or the like can be used.
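Steps (1) and (2) amount to a combinatorial enumeration: one fragment is drawn from each fragment group, and every combination is a candidate compound model. The following minimal Python sketch illustrates only that enumeration; it does not build chemical structures as Ludi™ or CombiLibMaker™ would. The fragment names, the group labels, and the function `enumerate_compound_models` are hypothetical.

```python
from itertools import product

# Step (1): one fragment group per sphere representing pharmacophoric
# features. The fragment names are illustrative placeholders only.
fragment_groups = {
    "Hydrophobic": ["phenyl", "cyclohexyl"],
    "Hydrogen-bond Donor": ["hydroxyl", "amine"],
    "Positive Ionizable": ["guanidine"],
}


def enumerate_compound_models(groups):
    """Step (2): select one fragment from each group, in every combination."""
    names = list(groups)
    return [dict(zip(names, combo)) for combo in product(*groups.values())]


candidates = enumerate_compound_models(fragment_groups)
for model in candidates:
    print(model)
```

Each candidate would then be assembled (directly or via linkers), superimposed onto the pharmacophore as in step (3), and refined by fragment substitution as in steps (4) and (5).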

Binding of fragments includes direct binding of fragments to each other, and binding via a linker that is used between fragments. Examples of a linker to be used herein are not specifically limited, and include hydrocarbon groups having straight chains or side chains, and groups in which such a hydrocarbon group is substituted with a hetero atom.

Furthermore, by designing a peptide having amino acid residues that exist in a continuous sequence in EGF or EGFR and participate in interaction, a low molecular weight compound mimicking the peptide can also be designed based on the structure information on the EGF-EGFR crystal. For example, a partial sequence such as Cys33-Trp49 of EGF (CVVGYIGERCQYRDLKW: SEQ ID NO: 10), Ser11-Asn32 in domain I of EGFR (SNKLTQLGTFEDHFLSLQRMFN: SEQ ID NO: 11), Pro241-Thr266 in domain II of EGFR (PPLMLYNPTTYQMDVNPEGKYSFGAT: SEQ ID NO: 12), or Leu345-Leu363 in domain III (LHILPVAFRGDSFTHTPPL: SEQ ID NO: 13) can exert action to regulate activity, such as agonist activity or antagonist activity. Thus, a peptide mimic of the sequence is created to reduce its molecular weight, so that it can be optimized as a medicament.

The methods explained in “3-1” to “3-4” can be used not only individually, but also in combination or repeatedly. For example, after screening for a compound using the structure coordinates of the EGF-EGFR complex is conducted, the selected compound can further be subjected to the screening method using a pharmacophore. Combination or repetition of a plurality of techniques makes it possible to identify a substance regulating EGFR activity more precisely.

3-5 Evaluation by Biochemical Assay

The above-explained method for designing or screening for a substance regulating EGFR activity using the structure coordinates of the EGF-EGFR complex, or the above-explained method for designing or screening for a substance regulating EGFR activity using a pharmacophore, enables rapid screening on a computer. However, although a compound group selected by screening utilizing a computer has the expected activity at a higher probability, not all of the compounds have such activity. Thus, it is preferable to evaluate many compounds experimentally (using biochemical assay). Specifically, the screening or the design method using structure information and pharmacophores can be said to be a method for screening for a “candidate” substance regulating EGFR activity when it does not contain a step of subjecting compounds to biochemical assay.

Hence, when compounds to be experimentally evaluated are selected based on the results of computer screening, it is necessary to determine the number of compounds to be actually evaluated experimentally in consideration of the number of active compounds expected as a result of the evaluation.

In general, a program for conducting computer screening contains an evaluation system. However, such an evaluation system often involves original procedures tailored to the algorithm of the corresponding program. When the activity value of each compound is obtained by means of an evaluation system, compounds to be subjected to experimental evaluation can be selected based on the activity values. However, there are many evaluation systems by which activity values are not obtained and only empirical numerical values are obtained. In the meantime, the purpose of computer screening is to narrow down the number of compounds to be subjected to experimental evaluation, and it is meaningful to select top compounds ranked by an evaluation system in numbers that enable experimental evaluation. For example, when a probability that a compound selected by computer screening has expected activity is supposed to be between 5% and 30%, approximately 30 to 200 candidate compounds are selected to obtain 10 compounds that are substances controlling activity. Furthermore, approximately 160 to 1000 candidate compounds are selected to obtain 50 compounds that are substances controlling activity. At this time, to confer diversity on compounds controlling activity that are finally obtained, it is also meaningful to conduct clustering of top compounds ranked by evaluation based on similarities in structure, physical property, and the like, and then to select from each cluster the number of compounds to be subjected to experimental evaluation.
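The candidate counts quoted above follow from dividing the desired number of active compounds by the assumed hit probability. A minimal Python sketch of that arithmetic is given below; the function name `candidates_needed` is hypothetical, and integer arithmetic on percentages is used to avoid floating-point rounding when taking the ceiling.

```python
def candidates_needed(target_actives, hit_rate_percent):
    """Smallest number of candidates whose expected number of actives
    reaches `target_actives` at the given hit rate (in whole percent)."""
    # Ceiling division on integers: ceil(target * 100 / percent).
    return -(-target_actives * 100 // hit_rate_percent)


for target in (10, 50):
    low = candidates_needed(target, 30)   # optimistic 30% hit rate
    high = candidates_needed(target, 5)   # pessimistic 5% hit rate
    print(f"{target} actives: {low} to {high} candidates")
```

For 10 actives this gives 34 to 200 candidates, and for 50 actives 167 to 1000, in line with the approximate ranges stated in the text.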

Specifically, further subjecting (candidate) substances regulating EGFR activity selected by screening on a computer to biochemical assay using cells expressing EGFR or EGFR itself makes it possible to more effectively select a substance regulating EGFR activity. Whether or not a test substance exerts action regulating EGFR activity in a biochemical assay can be determined by examining if there is a difference in protein activity between a case where the compound has been added to a system by which protein activity can be confirmed and a case where no compounds have been added. Having action regulating activity indicates that there is a difference between assayed protein activity values of a group to which a test compound has been added and those of a group to which no test compounds have been added. For example, having action regulating activity means to have an inhibition (or suppression) rate or enhancement (or promotion) rate calculated by the following formula of 10% or more, preferably 30% or more, more preferably 50% or more, further more preferably 70% or more, and particularly preferably 90% or more.

Inhibition (suppression) rate or enhancement (promotion) rate (%) = absolute value of (assayed value of the group to which no compounds have been added − assayed value of the group to which a test compound has been added) / assayed value of the group to which no compounds have been added × 100

Here, whether action is inhibition action or enhancement action and assayed values can be appropriately determined based on a system type by which protein activity can be confirmed. For example, when a system by which protein activity can be confirmed is the method of biochemical assay example 6 shown below, absorbance can be used. In a case where the assayed values of a group to which a test substance has been added are less than the assayed values of a group to which no test substances have been added, the test substance can be said to be an EGFR antagonist. In a case where assayed values of a group to which a test substance has been added are greater than the assayed values of a group to which no test substances have been added, the test substance can be said to be an EGFR agonist. When a measurement system contains background or noise values, an assayed value can be obtained by subtracting such background or noise values from the original assayed value.
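The rate formula and the agonist/antagonist determination described above can be sketched in Python as follows. The function names, the default 10% threshold argument, and the absorbance values are hypothetical illustrations; the sketch assumes an assay, such as the cell proliferation assay, in which a lower assayed value indicates inhibition.

```python
def regulation_rate(control, treated, background=0.0):
    """Inhibition or enhancement rate (%) relative to the untreated group.

    control: assayed value of the group to which no compound was added.
    treated: assayed value of the group to which a test compound was added.
    background: background or noise value subtracted from both readings.
    """
    c = control - background
    t = treated - background
    return abs(c - t) / c * 100


def classify(control, treated, background=0.0, threshold=10.0):
    """Label a test substance using the threshold rate (10% by default)."""
    if regulation_rate(control, treated, background) < threshold:
        return "no regulating action"
    return "antagonist" if treated < control else "agonist"


# Hypothetical absorbance readings.
print(regulation_rate(1.0, 0.4), classify(1.0, 0.4))
print(regulation_rate(1.0, 1.5), classify(1.0, 1.5))
print(classify(1.0, 0.95))
```

Raising `threshold` to 30, 50, 70, or 90 applies the successively more preferred criteria stated in the text.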

Examples of biochemical assay are as shown below, but are not limited thereto.

BIOCHEMICAL ASSAY EXAMPLE 1 EGF Receptor Binding Assay

Cells expressing EGF receptors at a high level, represented by A431 (human squamous cell carcinoma) cells and the like, or soluble EGF receptors are immobilized on a plate. EGF labeled with europium and a test substance are added to each well, followed by incubation for a given time period. After incubation, the plate is washed, a DELFIA enhancement reagent is added, and then time-resolved fluorescence is measured. Fluorescence count of a well to which DMSO has been added instead of the test substance is used as a control. A test substance showing a count lower than the control count can be screened for as a substance inhibiting EGF binding.

Similarly, receptor binding assay can also be conducted using EGF labeled with RI or the like.

BIOCHEMICAL ASSAY EXAMPLE 2 ELISA Assay

Cells expressing EGF receptors at a high level, including A431 cells, or soluble EGF receptors are immobilized on a plate. EGF and a test substance are added to each well, followed by incubation for a given time period. After incubation, the plate is washed, anti-EGF antibodies labeled with HRP are added, and then incubation is further conducted. The plate is washed again, and then a substrate solution is added thereto, thereby initiating an enzyme reaction. After a given time period, the reaction is stopped, and then absorbance is measured. Using the absorbance of a well to which DMSO has been added instead of the test substance as a control, a test substance showing an absorbance lower than that of the control can be screened for as a substance inhibiting the binding of EGF.

BIOCHEMICAL ASSAY EXAMPLE 3 Binding Experiment Using BIACORE®

Soluble EGF receptors are immobilized on a sensor chip, and then test substances are introduced onto the chip through the BIACORE® (registered mark) microchannel system. Changes in the quantity of substance bound on the sensor chip surface are detected by surface plasmon resonance, so as to be able to screen for test substances specifically binding to the EGF receptor.

BIOCHEMICAL ASSAY EXAMPLE 4 Phosphorylation Experiment for EGF Receptor (Western Blotting)

Cells expressing EGF receptors at a high level, including A431 cells, are inoculated on a plate. EGF and test substances are added to each well, followed by incubation for a given time period. After incubation, the cells are lysed using a Laemmli buffer (Laemmli, U.K., 1970, NATURE 227: 680-685). Protein is separated by SDS-PAGE, and then blotted onto a membrane. Tyrosine-phosphorylated EGF receptors are caused to emit light using an anti-phosphorylated EGF receptor antibody, a secondary antibody labeled with HRP or the like, and an appropriate substrate solution, so that they are detected on X-ray film or the like. The quantity (darkness of a band detected) of the phosphorylated receptors in cells to which DMSO has been added instead of the test substances is used as a control, so that test substances showing bands which are lighter in color than that of the control can be screened for as agents inhibiting the activation of EGF receptor.

Furthermore, a test substance showing a band which is darker in color than that of the control can be screened for as an agent activating an EGF receptor.

BIOCHEMICAL ASSAY EXAMPLE 5 Reporter Gene Assay

A plasmid is constructed wherein a luciferase gene is ligated downstream of the promoter region of a gene, such as c-fos or c-myc, whose transcription is induced by EGF stimulation. The plasmid is transfected into cells expressing EGF receptors, represented by A431 cells or the like, thereby producing recombinant cells that transiently or permanently carry a reporter gene expressed upon EGF stimulation. EGF and test substances are added to the recombinant cells, and then luciferase activity is assayed after a given time period. The luciferase activity of a well to which DMSO has been added instead of the test substances is used as a control, so that test substances showing luciferase activity lower than that of the control can be screened for as agents inhibiting the activation of EGF receptor.

Furthermore, a test substance showing luciferase activity that is higher than that of the control can also be screened for as an agent activating an EGF receptor.

BIOCHEMICAL ASSAY EXAMPLE 6 Cell Proliferation Assay

Cells showing EGF-dependent proliferation, represented by A431 cells, SiHa cells (human squamous cell carcinoma), and the like, are inoculated on a plate. EGF and test substances are added. A few days later, the viable cell count is quantified by an MTT method or the like. Absorbance of a well to which DMSO has been added instead of the test substances is used as a control, so that test substances showing absorbances lower than that of the control can be screened for as agents inhibiting the activation of EGF receptor. Furthermore, a test substance showing absorbances higher than that of the control can be screened for as an agent activating an EGF receptor.

The effect and the action of EGF have been shown to be induced via EGF receptors existing on cell membranes. It is expected that a compound binding to an EGF receptor so as to activate it has physiological activity and pharmacological activity equivalent to those of EGF. Evaluation examples of the biological activity (drug efficacy experiment) of an agent activating an EGF receptor are shown below.

EXPERIMENT EXAMPLE 1 Action on Tumor Transplantation Model Mouse

Colon-26 (carcinoma of mouse colonic gland) or A431 cells are transplanted subcutaneously in the abdomens of BALB/c female mice. After a given time period, carcinoma size, that is, the major axis (“a” mm) and the minor axis (“b” mm) of each tumor, are measured, and then the carcinoma weight is calculated by the following formula.

Carcinoma weight (mg) = ab²/2

(Since these cancerous tumors generally exist in the form of spheroids, a formula for calculating the volume of a spheroid is employed. In addition, volume=weight is employed, wherein the specific gravity of a cancer cell is supposed to be approximately 1.)
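As a worked illustration of the formula, the following short Python sketch computes the estimated carcinoma weight from the two measured axes. The function name `carcinoma_weight_mg` and the example measurements are hypothetical; the only assumptions are those stated above (spheroid shape, specific gravity of approximately 1, so that 1 mm³ corresponds to 1 mg).

```python
def carcinoma_weight_mg(major_axis_mm, minor_axis_mm):
    """Estimated carcinoma weight (mg) = a * b^2 / 2, with a and b in mm."""
    return major_axis_mm * minor_axis_mm ** 2 / 2


# Hypothetical measurement: a = 10 mm, b = 6 mm.
weight = carcinoma_weight_mg(10, 6)
print(weight)
```

For a tumor measuring 10 mm by 6 mm, the estimate is 10 × 36 / 2 = 180 mg.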

After carcinoma weight is determined immediately before the administration of a test substance, the test substance is administered to each group, and then changes with time in the weight of transplanted carcinoma cells are measured. The agent activating an EGF receptor obtained by the present invention increases or decreases carcinoma weight.

EXPERIMENT EXAMPLE 2 Action on Acute Hepatic Disorder Model Mouse

A test substance is administered to ddY male mice. 30 minutes later, 3 vol % carbon tetrachloride is injected subcutaneously. At 24 hours after the injection, blood is collected from the orbital plexus, and then serum is prepared. The serum is diluted, and then GOT and GPT values are measured. The agent activating an EGF receptor obtained by the present invention suppresses increases in GOT and GPT values.

EXPERIMENT EXAMPLE 3 Action on Gastric Ulcer Model Rat

SD male rats are fasted for 24 hours before the start of the experiment, and then incised along the median line under anesthesia so as to expose the stomach. The outside and the inside of each stomach wall are caught with ring-tipped tweezers, and then 60 vol % acetic acid is added dropwise to the inside. After a given time period, the tweezers are removed, and then the incision is sutured. The rats are then fed ad libitum. On the next day, administration of a test substance is begun, and changes in the thickness of the mucosal layer are measured with time. The agent activating an EGF receptor obtained by the present invention increases the thickness of the mucosal layer.

EXPERIMENT EXAMPLE 4 Action on Granulation

Paper discs allowed to absorb a test substance are transplanted subcutaneously to the dorsal regions of Wistar male rats under anesthesia. Changes in dry weight, protein content, DNA content and/or hydroxyproline content of the thus formed granulation tissues are measured with time. The agent activating an EGF receptor obtained by the present invention increases the dry weight, the protein content, the DNA content and/or the hydroxyproline content.

EXPERIMENT EXAMPLE 5 Action on Parkinson Disease Model Rat

Parkinson disease model rats are produced by administering 6-hydroxydopamine HBr to the nigrostriatal dopamine pathway (on one side) of each SD female rat. The ventral midbrain obtained from a 14-week-old rat fetus and a test substance are transplanted into the striatum, amphetamine is administered, and then changes in mobility are observed with time. The agent activating an EGF receptor obtained by the present invention decreases mobility when amphetamine is administered.

Moreover, the biological activity of the agent inhibiting the activation of EGF receptor can be confirmed by, for example, various methods shown below.

EXPERIMENT EXAMPLE 6 Action on Tumor Transplantation Model Mouse

This experiment is conducted by procedures similar to those in experiment example 1, except that EGF is administered together with a test substance, and biological activity of this group is compared with that of a group to which only EGF has been administered. The agent inhibiting the activation of EGF receptor obtained by the present invention inhibits changes in carcinoma weight that are observed when only EGF has been administered.

EXPERIMENT EXAMPLE 7 Action on Rat Subjected to Oophorectomy

After ovaries have been resected from SD female rats, a test substance is administered. After a given time period, both femora are resected, and calcium and hydroxyproline levels contained in bone trabeculae are measured. The agent inhibiting the activation of EGF receptor obtained by the present invention suppresses decreases in calcium and hydroxyproline contents.

Furthermore, the biological activity of an EGF-receptor-selective drug delivery agent can be confirmed by, for example, the following method.

EXPERIMENT EXAMPLE 8 Action on Tumor Transplantation Model Mouse

This experiment is conducted by procedures similar to those in experiment example 1, except that an appropriate cytotoxic substance is administered together with a test substance, and then biological activity of this group is compared with that of a group to which only the cytotoxic substance has been administered. The EGF-receptor-selective drug delivery agent obtained by the present invention enhances the action suppressing an increase in carcinoma weight that is observed when only the cytotoxic substance has been administered.

The physiological activity and the pharmacological activity of EGF have been reported as follows.

-   1) Promotion [Dev Biol., 12, 394 (1965), Science, 201, 515 (1978)] or suppression [J. Biol. Chem., 259, 7761 (1984)] of the proliferation of normal cells and tumor cells
-   2) Promotion of self-regeneration of damaged organs [Ciba Found Symp., 55, 95 (1977)]
-   3) Action suppressing gastric acid secretion [Gut, 23, 951 (1982)]; action protecting mucous membrane of digestive tract [J. Clin. Gastroenterol., 13, S103 (1991)]
-   4) Action promoting wound healing [J. Surg. Res., 33, 164 (1982)]; action correcting cornea [Exp. Eye Res., 40, 47 (1985)]
-   5) Action protecting nerves [J. Neurosurg. Sci., 37, 1 (1993)]; action differentiating/proliferating neurons [J. Neurosci., 12, 4565 (1992), J. Neurosci., 16, 2649 (1996)]
-   6) Action promoting calcium liberation [Endocrinology, 107, 270 (1980)]

A substance (agonist) binding to an EGF receptor so as to activate the receptor can be used as 1) an agent for regulating the proliferation of tumor cells and preferably an anti-tumor agent, or an agent promoting the activation or the metabolism of cells, cosmetics, and preferably a depilatory; 2) an agent promoting the regeneration of damaged organs and preferably a therapeutic agent against liver function failure; 3) an agent used against digestive tract dysfunction and preferably an agent suppressing gastric acid secretion, an agent protecting the mucous membranes of digestive tracts, and an antiulcer agent; 4) an agent promoting wound healing, and preferably a therapeutic agent for skin ulcer or damaged cornea due to diabetes mellitus, injury, burn, or the like; and 5) an agent regenerating or protecting neurons and preferably an anti-parkinsonism agent.

Furthermore, a substance (antagonist) binding to an EGF receptor so as to inhibit the receptor activation can be used as: 1) an agent regulating the proliferation of normal cells and tumor cells, and preferably an anti-tumor agent, a therapeutic agent for psoriasis, and a therapeutic agent for chronic obstructive respiratory disease including asthma; and 2) an agent suppressing bone resorption, and preferably a prophylactic or a therapeutic agent for osteoporosis.

Furthermore, a compound binding to an EGF receptor can be used as an agent for selectively delivering a drug to cells having the EGF receptor and preferably an agent for selectively delivering a cytotoxic substance as a drug to tumor cells.

The present invention also provides a substance regulating EGFR activity that is identified or designed by the above-described method for designing or screening for a substance regulating EGFR activity using the structure coordinates of the EGF-EGFR complex, or the method for designing or screening for a substance regulating EGFR activity using a pharmacophore.

3-6 EGFR Agonist or EGFR Antagonist

The present invention also provides an EGFR agonist or an EGFR antagonist having a structure that fits a pharmacophore defined by specific spheres representing pharmacophoric features. “Having a structure that fits a pharmacophore” means that the structure coordinates obtained when the three-dimensional structure of an EGFR agonist or an EGFR antagonist is generated match the relative arrangement and features of at least 3 spheres representing pharmacophoric features. Examples of atoms or functional groups that fit each sphere representing pharmacophoric features are defined in a manual attached to a program such as Catalyst™ (Accelrys Inc., San Diego, Calif., Catalyst Documentation Release 4.5, 1999). An example of a pharmacophore having specific spheres representing pharmacophoric features is disclosed in Examples 5 and 6, but is not limited thereto. Pharmacophores obtained by the method for designing a pharmacophore of the present invention are also included herein.
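The "at least 3 spheres" criterion can be illustrated with a minimal Python sketch. It is not the Catalyst™ procedure. It assumes that the ligand conformer has already been superimposed onto the coordinate frame of the pharmacophore, and the function name, feature labels, radii, and coordinates are all hypothetical values chosen for illustration.

```python
import math

def fits_pharmacophore(ligand_features, spheres, min_matches=3):
    """Return True if at least `min_matches` spheres are satisfied.

    ligand_features: list of (feature_type, (x, y, z)) for the ligand,
        assumed to be pre-aligned to the pharmacophore frame.
    spheres: list of (feature_type, (x, y, z) center, radius in angstroms).
    A sphere is satisfied when a ligand feature of the same type lies inside it.
    """
    matched = 0
    for kind, center, radius in spheres:
        if any(k == kind and math.dist(xyz, center) <= radius
               for k, xyz in ligand_features):
            matched += 1
    return matched >= min_matches

# Hypothetical pharmacophore and ligand (illustrative values only).
demo_spheres = [
    ("hbond_donor", (0.0, 0.0, 0.0), 1.5),
    ("hydrophobic", (5.0, 0.0, 0.0), 1.5),
    ("hbond_acceptor", (0.0, 5.0, 0.0), 1.5),
    ("positive", (5.0, 5.0, 0.0), 1.5),
]
demo_ligand = [
    ("hbond_donor", (0.5, 0.0, 0.0)),
    ("hydrophobic", (5.0, 1.0, 0.0)),
    ("hbond_acceptor", (0.0, 5.0, 1.0)),
]
demo_fit = fits_pharmacophore(demo_ligand, demo_spheres)
```

A practical tool would also search over conformers and superpositions; the sketch only checks the counting rule once an alignment is given.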

3-7 Method for Regulating EGFR Activity

The present invention also provides a method for regulating EGFR activity, which comprises bringing EGFR into contact with a substance regulating EGFR activity that has a structure that fits a pharmacophore defined by specific spheres representing pharmacophoric features. This method may further comprise a step of confirming that EGFR activity has been regulated. The step of confirming that EGFR activity has been regulated can be realized by detecting that at least one phenomenon (e.g., phosphorylation of EGFR intracellular domains and cell proliferation) known to take place as a result of EGFR activation has been inhibited or promoted.

4. Production of EGF Variant and EGFR Variant Using Structure Information

The present invention provides a method for designing an EGFR variant, or a method for designing an EGF variant, which uses structure coordinates of an EGF-EGFR complex. The present invention further provides a method for producing an EGFR variant or an EGF variant using the design method and the EGFR variant or the EGF variant obtainable by the production method.

The interaction sites and the dimerization sites of EGF and EGFR can be observed using the structure coordinates of the EGF-EGFR complex of the present invention, so that a point mutation can be introduced, depending on the purpose, into an amino acid residue in EGF or EGFR. The method for designing an EGFR variant or the method for designing an EGF variant, which uses the structure coordinates of the EGF-EGFR complex, can comprise the following steps of: entering the structure coordinates of the EGF-EGFR complex into a computer; specifying amino acid residues composing EGF-EGFR binding sites or EGFR dimerization sites by analyzing the structure of the EGF-EGFR complex; and specifying amino acid residues into which mutation is introduced.

Furthermore, the method may also comprise a step of visually displaying the three-dimensional structure of the EGF-EGFR complex on a computer. The step of specifying amino acid residues composing EGF-EGFR binding sites or EGFR dimerization sites by analyzing the structure of the EGF-EGFR complex and the step of specifying amino acid residues into which mutation is introduced can be realized by visual observation and/or analysis using a computer program.

The method for producing an EGFR variant or an EGF variant using the design method can comprise the following steps of: designing an EGFR variant or an EGF variant using the structure coordinates of the EGF-EGFR complex; preparing a variant protein; and subjecting the variant protein to biochemical assay so as to confirm that the protein has the desired activity.

For example, the method makes it possible to design an EGF variant having enhanced EGF activity. Compared with a simple recombinant EGF, such an EGF variant has merit in that it can be administered in low doses as a medicament for injury or the like.

Specifically, to design an EGF variant having activity as an agonist (agent), mutation is introduced into amino acid residues of EGF that participate in direct interaction with EGFR, and into the region adjacent thereto, so that they bind more strongly to the corresponding amino acid residues on the EGFR side upon interaction. Here, “the region adjacent thereto” means a region participating in electrostatic interaction, hydrophobic interaction, van der Waals interaction, hydrogen bond formation, or the like with the amino acid residue, and specifically a region that is located within 5 Å of the amino acid residue. Furthermore, when a variant EGF having activity as an agonist is designed by introducing mutation into a site other than the above regions, such a design method is also encompassed in the scope of the present invention, as long as it uses the structure coordinates of the present invention.
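The 5 Å criterion for "the region adjacent thereto" can be sketched as a simple distance search. The following Python sketch assumes a hypothetical input format of (residue label, x, y, z) tuples, one per atom, and treats a residue as adjacent when any of its atoms lies within the cutoff of any atom of the target residue. The labels and coordinates are invented for illustration and are not taken from Table 1 or Table 2.

```python
import math

def residues_within(atoms, target, cutoff=5.0):
    """Return sorted labels of residues with any atom within `cutoff`
    angstroms of any atom of the `target` residue.

    atoms: list of (residue_label, x, y, z) tuples (hypothetical format).
    """
    target_xyz = [(x, y, z) for res, x, y, z in atoms if res == target]
    near = set()
    for res, x, y, z in atoms:
        if res == target:
            continue  # the target residue is not its own neighbor
        if any(math.dist((x, y, z), t) <= cutoff for t in target_xyz):
            near.add(res)
    return sorted(near)

# Invented coordinates for illustration only.
demo_atoms = [
    ("ligand:res1", 0.0, 0.0, 0.0),
    ("ligand:res1", 1.5, 0.0, 0.0),
    ("receptor:resA", 4.0, 0.0, 0.0),   # 2.5 angstroms from the second atom
    ("receptor:resB", 0.0, 6.0, 0.0),   # 6.0 angstroms away, outside the cutoff
    ("receptor:resC", 9.0, 9.0, 9.0),
]
neighbors = residues_within(demo_atoms, "ligand:res1")
```

With real coordinates, the atom records would be parsed from the coordinate table, and the resulting list would serve only as a first candidate set for mutation.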

The step of entering the structure coordinates of the EGF-EGFR complex into a computer, the step of visually displaying the three-dimensional structure of the EGF-EGFR complex on the computer, and the step of specifying amino acid residues composing EGF-EGFR binding sites or EGFR dimerization sites by analyzing the structure of the EGF-EGFR complex are described in detail in “3-1” above. In the step of specifying amino acid residues into which mutation is introduced, examples of interaction by noncovalent bonds that should be considered include electrostatic interaction, hydrophobic interaction, van der Waals interaction, and hydrogen bond formation. By comprehensively considering them, a final variant can be designed. For example, in the vicinity of amino acid residues having negatively charged side chain portions, such as glutamic acid and aspartic acid on the EGFR side, mutation is introduced so that positively charged side chains of amino acid residues such as lysine, arginine, and histidine are arranged as amino acid residues of EGF adjacent thereto. Conversely, in the vicinity of amino acid residues having positively charged side chain portions, such as lysine, arginine, and histidine, mutation is introduced so that negatively charged side chains of amino acid residues such as glutamic acid and aspartic acid are arranged as amino acid residues of EGF adjacent thereto. Moreover, for a portion where amino acid residues whose side chain portions have high hydrophobicity (e.g., alanine, leucine, isoleucine, valine, proline, phenylalanine, tryptophan, and methionine) mainly cluster together for interaction, positions in EGF at which hydrophilic amino acid residues such as serine, threonine, tyrosine, asparagine, and glutamine, or charged amino acid residues such as aspartic acid, glutamic acid, lysine, arginine, and histidine, are present are identified, and these amino acid residues are substituted with hydrophobic amino acid residues, thereby enhancing hydrophobic interaction. Furthermore, for the main chain portion that forms hydrogen bonds, or side chain portions of amino acid residues such as serine and tyrosine, amino acid residues corresponding thereto are mutated so as to be able to form new hydrogen bonds. In the above mutation, care should be exercised so that van der Waals interactions become as large as possible at the side chain and the main chain of amino acid residues, and so that no steric hindrance is generated between atoms. Moreover, it is necessary to prevent any new gap portions from being generated due to mutation. For a region wherein a gap portion is already present, it is also necessary to consider mutation that fills the gap portion as far as possible. In this manner, electrostatic interaction, hydrophobic interaction, van der Waals interaction, hydrogen bond formation, and other factors are comprehensively taken into consideration visually on a computer screen and/or using an appropriate computer program, so that a final variant can be designed.
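The charge and hydrophobicity rules described above can be written down as a first-pass heuristic. The Python sketch below is a deliberately simplified illustration and not the design procedure itself. It looks only at the residue types neighboring a position on the EGFR side and returns the residue class or classes that the rules would favor at the facing EGF position. Hydrogen bonding, van der Waals packing, steric hindrance, and gap filling, which the text requires to be considered together, are not modeled. The function name and the majority-vote rule are assumptions.

```python
NEGATIVE = {"Glu", "Asp"}
POSITIVE = {"Lys", "Arg", "His"}
HYDROPHOBIC = {"Ala", "Leu", "Ile", "Val", "Pro", "Phe", "Trp", "Met"}

def suggest_classes(egfr_neighbors):
    """Suggest residue classes for an EGF position from EGFR neighbor types.

    Negative neighbors favor a positive substituent, positive neighbors
    favor a negative one, and hydrophobic neighbors favor a hydrophobic one.
    The class with the most supporting neighbors wins; ties return all
    tied classes, and no supporting neighbor returns an empty list.
    """
    support = {
        "positive": sum(r in NEGATIVE for r in egfr_neighbors),
        "negative": sum(r in POSITIVE for r in egfr_neighbors),
        "hydrophobic": sum(r in HYDROPHOBIC for r in egfr_neighbors),
    }
    best = max(support.values())
    if best == 0:
        return []
    return sorted(name for name, count in support.items() if count == best)

# Hypothetical neighbor lists for illustration.
near_acidic_patch = suggest_classes(["Glu", "Asp", "Leu"])
near_hydrophobic_patch = suggest_classes(["Leu", "Ile", "Lys"])
```

Any class suggested this way would still need to be checked on the three-dimensional structure before a final variant is chosen.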

For example, according to the analytical results of the crystal structure, Met21 of EGF can, with a certain degree of probability, be substituted with an amino acid residue that is capable of forming a hydrogen bond and has carbons up to or beyond a delta carbon. In particular, substitution of Met21 with Lys, Arg, Glu, or Gln can confer the ability to form hydrogen bonds between the hydroxyl group of Tyr45 side chain and Leu14 main chain carbonyl of Gln16 side chain. Thus, elevated activity of binding to EGFR, improved EGF physical properties, and the like can be expected. In contrast, substitution with amino acids whose side chains extend only to Cγ (e.g., Thr and Ser) can reduce activity. Furthermore, Gln43 can, with a certain degree of probability, be substituted with a hydrophobic residue such as Leu or Ile. This can be inferred from the fact that the residue of EGFR capable of interacting with it is Leu.

Furthermore, to design an EGF variant having activity as an antagonist (antagonistic drug), mutation is introduced into amino acid residues of EGF participating in direct interaction with EGFR and into the region adjacent thereto. Next, a variant is selected that is characterized in that the binding of the variant EGF to EGFR makes it impossible to keep the original relative position of the 2 molecules, EGF and EGFR, in three-dimensional space, or in that the variant EGF and EGFR become unable to interact with each other, so that the variant has activity as an antagonist against native EGF. Alternatively, a variant that does not cause any structural changes in EGFR while binding to EGFR is also useful. There is a case wherein variant EGF having activity as an antagonist is designed by introducing mutation into a site other than those participating in direct interaction with EGFR, and such a design method is also encompassed in the scope of the present invention as long as it uses the structure coordinates according to the present invention.

Moreover, introduction of mutation that makes interaction difficult into amino acid residues participating in EGFR dimerization also makes it possible to design a variant EGFR that binds to EGF, but does not perform dimerization. If it is designed as a soluble variant EGFR, the soluble variant EGFR is expected to exert a therapeutic effect against carcinoma, psoriasis, or the like by exerting EGFR antagonist activity in a manner similar to that of an EGF neutralizing antibody. For example, Gln252 is linked by a hydrogen bond to the Ala286 main chain, Tyr246 is linked by a hydrogen bond to the Cys283 main chain, and Asn86 is linked by a hydrogen bond to the Thr249 side chain so as to stabilize the dimer. Substitution of these amino acid residues with amino acids whose side chains are unable to form hydrogen bonds makes it possible to construct a molecule capable of trapping EGF without carrying out dimerization.

“Variant” means a protein having an amino acid sequence derived from the amino acid sequence of the original protein by substitution, addition, deletion, or modification of at least 1, for example, 1 or several (1 to 10), amino acids. Examples of amino acid modification include, but are not limited to, amidation, carboxylation, sulfation, halogenation, alkylation, glycosylation, phosphorylation, hydroxylation, and acylation (e.g., acetylation). Amino acids to be substituted or added may be native amino acids, non-native amino acids, or amino acid analogues. Native amino acids are preferred. “Native amino acids” mean L-isomers of native amino acids. Examples of native amino acids include glycine, alanine, valine, leucine, isoleucine, serine, methionine, threonine, phenylalanine, tyrosine, tryptophan, cysteine, proline, histidine, aspartic acid, asparagine, glutamic acid, glutamine, γ-carboxyglutamic acid, arginine, ornithine, and lysine. Unless otherwise specified, all amino acids in this specification are L-amino acids. “Non-native amino acids” mean amino acids in a protein that are not generally found in nature. Examples of non-native amino acids include norleucine, para-nitrophenylalanine, homophenylalanine, para-fluorophenylalanine, 3-amino-2-benzylpropionic acid, D- or L-homoarginine, and D-phenylalanine. “Amino acid analogue” means a molecule that is not an amino acid, but is analogous to an amino acid in physical properties and/or functions. Examples of an amino acid analogue include ethionine, canavanine, and 2-methyl glutamine.

Structure coordinates to be used in the method for designing a variant of the present invention may be the structure coordinates shown in Table 1 or Table 2, structure coordinates of a structure homologue newly produced by calculation or the like using a computer based on those shown in Table 1 or Table 2, or some of the structure coordinates extracted from these structure coordinates. In the case of designing a medicament for humans, the use of structure coordinates of a human-derived protein is further preferred.

A variant designed according to the present invention can be prepared by many methods. For example, DNA encoding a variant designed based on the present invention can be obtained by chemically synthesizing an oligonucleotide corresponding to the variant, and substituting a site of the oligonucleotide encoding the corresponding amino acid residues (which has been determined so that mutation is effective therefor) with a native oligonucleotide portion using a sequence-specific oligonucleotide cleavage enzyme (restriction enzyme). The thus obtained variant DNA is incorporated into an appropriate expression vector, the vector is introduced into an appropriate host, and then the recombinant protein thereof is produced by the host, so that the above variant can be obtained.

Furthermore, for a variant designed by the design method of the present invention and prepared, it is preferable to confirm whether or not the variant has the desired activity by biochemical assay. An assay system that can be used is not specifically limited, and the previously described (3-5) biochemical assay example can be employed. Specifically, an amino acid residue into which mutation should be introduced is determined utilizing the structure coordinates of the present invention, a variant protein is produced using gene engineering techniques, the variant protein is subjected to biochemical assay, and it is then evaluated whether or not the variant protein has the desired activity. Specific examples of desired activity of EGF include EGFR-binding ability and/or ability to enhance or attenuate EGFR dimerization-inducing ability. Specific examples of desired activity of EGFR include EGF-binding ability and/or ability to enhance or attenuate dimerization ability.

As described above, the use of the structure coordinates of the present invention makes it possible to produce a variant based on theoretical analysis within a three-dimensional space, whereas such production has conventionally been conducted on a trial-and-error basis under conditions lacking theoretical support from the three-dimensional structure.

In the method for designing an EGF variant or an EGFR variant using the structure coordinates of the EGF-EGFR complex of the present invention, structure coordinates described in section “2. Structure coordinates of EGF-EGFR complex” can be used as structure coordinates. Moreover, by the use of structure coordinates of a ligand-receptor complex obtained using the structure coordinates of the EGF-EGFR complex by a homology modeling method or a molecular replacement method described below, a variant of the ligand or a variant of the receptor can be designed by a similar method.

The present invention provides a variant that is obtainable by the above methods for designing and for producing an EGF variant or an EGFR variant.

Furthermore, the present invention provides the following variants (A) to (D) as specific examples of a variant:

-   (A) an EGFR variant having amino acid mutation at an EGFR dimerization site;
-   (B) an EGFR variant having amino acid mutation at an EGF-EGFR binding site;
-   (C) an EGFR variant having amino acid mutation at an EGFR dimerization site and an EGF-EGFR binding site;
-   (D) an EGF variant having amino acid mutation at an EGF-EGFR binding site.

Preferred embodiments of (A) include the following (A-1) to (A-3):

-   (A-1) an EGFR variant having mutation in at least one amino acid residue among amino acid residues composing an EGFR dimerization site;
-   (A-2) an EGFR variant having mutation in at least one amino acid residue among amino acid residues composing an EGFR dimerization site, and having attenuated EGFR dimerization activity; and
-   (A-3) an EGFR variant having mutation in at least one amino acid residue among amino acid residues composing an EGFR dimerization site, and having enhanced EGFR dimerization activity.

Preferred embodiments of (B) include the following (B-1) to (B-3):

-   (B-1) an EGFR variant having mutation in at least one amino acid residue among amino acid residues composing an EGF-EGFR binding site;
-   (B-2) an EGFR variant having mutation in at least one amino acid residue among amino acid residues composing an EGF-EGFR binding site, and having attenuated EGF-binding activity; and
-   (B-3) an EGFR variant having mutation in at least one amino acid residue among amino acid residues composing an EGF-EGFR binding site, and having enhanced EGF-binding activity.

Preferred embodiments of (C) include the following (C-1) and (C-2):

-   (C-1) an EGFR variant having mutation in at least one amino acid residue among amino acid residues composing an EGFR dimerization site and in at least one amino acid residue among amino acid residues composing an EGF-EGFR-binding site;
-   (C-2) an EGFR variant having mutation in at least one amino acid residue among amino acid residues composing an EGFR dimerization site and in at least one amino acid residue among amino acid residues composing an EGF-EGFR-binding site, and having enhanced EGF-binding activity and attenuated EGFR-dimerization activity.

As shown in Example 9, EGF and EGFR are thought to carry out EGF-EGFR binding and EGFR dimerization by so-called “induced fit.” Hence, an EGFR variant having mutation at an EGFR dimerization site can show a change in EGF-binding ability, and an EGFR variant having mutation at an EGF-EGFR-binding site can show a change in EGFR dimerization ability. Therefore, possible examples of EGFR variants, in addition to the above EGFR variants, include those having any one of the following desired activities: enhanced EGF-binding activity; attenuated EGF-binding activity; enhanced EGFR dimerization activity; attenuated EGFR dimerization activity; enhanced EGF-binding activity and enhanced EGFR dimerization activity; attenuated EGF-binding activity and attenuated EGFR dimerization activity; enhanced EGF-binding activity and attenuated EGFR dimerization activity; and attenuated EGF-binding activity and enhanced EGFR dimerization activity.

Preferred embodiments of (D) include the following (D-1) to (D-3):

-   (D-1) an EGF variant having mutation in at least one amino acid residue among amino acid residues composing an EGF-EGFR binding site;
-   (D-2) an EGF variant having mutation in at least one amino acid residue among amino acid residues composing an EGF-EGFR binding site, and having attenuated EGFR-binding activity;
-   (D-3) an EGF variant having mutation in at least one amino acid residue among amino acid residues composing an EGF-EGFR binding site, and having enhanced EGFR-binding activity.

Amino acid residues composing an EGFR dimerization site and amino acid residues composing an EGF-EGFR binding site of a variant in this section are defined according to the explanation given in section “2. Structure coordinates of EGF-EGFR complex.” “Composing a dimerization site” can also entail the formation of important interactions in dimerization. This also applies in the case of an EGF-EGFR binding site.

“Attenuated activity” means that the activity of a protein having mutation is, for example, 0.8 times or less, preferably 0.5 times or less, and further preferably 0.3 times or less the activity of a protein having a native amino acid sequence that contains no mutation. “Enhanced activity” means that the activity of a protein having mutation is, for example, 1.2 times or more, preferably 1.5 times or more, and further preferably 2.0 times or more the activity of a protein having a native amino acid sequence that contains no mutation. Protein activity can be assayed using any method that persons skilled in the art can generally employ. Examples of such assay include the biochemical assay explained in section “3-5” of “3. Method for screening or designing using structure coordinates,” and the methods described in Examples 7 to 9.
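The thresholds given above can be applied mechanically to assay results. The following Python sketch is one way to do so. It assumes that both activities are measured in the same assay and units, and the function name, the "unchanged" label, and the default thresholds (the broadest values stated in the text, 0.8 and 1.2) are choices made for illustration.

```python
def classify_activity(variant_activity, native_activity,
                      attenuated_max=0.8, enhanced_min=1.2):
    """Classify a variant by its activity relative to the native protein.

    Returns "attenuated" when the ratio is at or below `attenuated_max`,
    "enhanced" when it is at or above `enhanced_min`, and "unchanged"
    otherwise. Stricter thresholds from the text (0.5 or 0.3, and 1.5
    or 2.0) can be passed in explicitly.
    """
    if native_activity <= 0:
        raise ValueError("native activity must be positive")
    ratio = variant_activity / native_activity
    if ratio <= attenuated_max:
        return "attenuated"
    if ratio >= enhanced_min:
        return "enhanced"
    return "unchanged"

# Hypothetical assay readouts (arbitrary units).
weak_binder = classify_activity(40.0, 100.0)
strong_binder = classify_activity(210.0, 100.0)
```

For instance, a variant with 1.3 times the native activity counts as enhanced under the default threshold but not under the preferred threshold of 1.5.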

5. Molecular Replacement Method Using Structure Coordinates of EGF-EGFR Complex

The present invention provides a method for obtaining structure coordinates of a protein or a protein complex with an unknown structure by a molecular replacement method, which uses structure coordinates of an EGF-EGFR complex. More specifically, the method can comprise the steps of: crystallizing a protein or a protein complex with an unknown structure; generating an X-ray diffraction image from the crystallized protein or the protein complex with the unknown structure; and applying at least some of the structure coordinates described in Table 1 or Table 2 to the X-ray diffraction image so as to generate at least a partial three-dimensional electron density map of the protein or the protein complex with the unknown structure. The molecular replacement method is a means for solving the phase problem in X-ray crystal structure analysis. When an X-ray diffraction image of a native crystal (crystal containing no heavy metal atoms) of a protein or a protein complex with an unknown structure has been obtained, by utilization of the structure information on a protein or a protein complex whose structure has already been determined, the method enables rapid determination of structure coordinates of the protein or the protein complex with the unknown structure without using a heavy atom isomorphous replacement method (Blundell, T. L. and Johnson, L. N., (1976) PROTEIN CRYSTALLOGRAPHY, pp. 443-464, Academic Press, New York).

The structure coordinates or some of the structure coordinates of the EGF-EGFR complex according to the present invention can be used in X-ray crystal structure analysis for a crystal containing the whole or a portion of EGF and EGFR, a crystal obtained from another protein having an amino acid sequence that shares significant homology with EGF or EGFR, a crystal obtained from another protein that is predicted to be structurally analogous to EGF or EGFR, or the like. When the molecular replacement method is conducted, for example, a program such as CNS™ (Brunger, A. T., et al., “Crystallography & NMR System: A New Software Suite for Macromolecular Structure Determination,” ACTA CRYSTALLOGR. D. BIOL. CRYSTALLOGR. 54(Pt. 5):905-921 (1998); Accelrys, Inc., San Diego, Calif.) or AMORE™ (one of the CCP4 program group of the Collaborative Computational Project, Number 4; ACTA CRYSTALLOGR. D50, 670-673 (1994)) can be utilized, and other programs may also be used.

Examples of crystals to which the molecular replacement method can be applied using the structure coordinates of the EGF-EGFR complex of the present invention include a crystal containing the whole or a portion of EGF and EGFR, a crystal of another protein having an amino acid sequence that shares significant homology with EGF or EGFR, a crystal of another protein predicted to be structurally analogous to EGF or EGFR, a crystal of a complex of EGFR and a compound (e.g., agonist or antagonist) that binds to EGFR, a crystal of a complex of EGF and a compound (e.g., antagonist) that binds to EGF, a crystal of a protein having amino acid residues that share significant homology with those of EGF, and a crystal of a protein having amino acid residues that share significant homology with those of EGFR. The method can also be applied to a crystal of an EGF variant, a crystal of an EGFR variant, and complexes thereof (including a ligand-receptor complex and a protein-compound complex). Here, significant homology generally means that 20% or more, and preferably 30% or more, of the amino acids are in agreement between the amino acid sequences compared. The molecular replacement method can be applied to structure factors actually calculated from an X-ray diffraction image of a target crystal so as to obtain a meaningful solution. Specifically, a method for structurally analyzing a crystal of an unknown substance other than the above substances by the molecular replacement method using all or some of the structure coordinates of the EGF-EGFR complex according to the present invention is encompassed in the scope of the present invention, when a meaningful solution can be obtained by this method.
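The homology criterion stated above (20% or more, preferably 30% or more, of amino acids in agreement) can be checked with a short calculation on an existing pairwise alignment. The Python sketch below assumes two already-aligned sequences of equal length with "-" as the gap character, and it counts identity over the columns in which both sequences have a residue. Other denominators are in common use, so this convention is an assumption, as are the function names and the toy sequences.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b)
             if a != "-" and b != "-"]
    if not pairs:
        return 0.0
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)

def homology_level(identity_percent):
    """Map percent identity onto the thresholds given in the text."""
    if identity_percent >= 30.0:
        return "preferred (30% or more)"
    if identity_percent >= 20.0:
        return "significant (20% or more)"
    return "below threshold"

# Toy aligned sequences for illustration only.
toy_identity = percent_identity("ACDE-G", "ACKEFG")
```

In the toy alignment, five columns contain residues in both sequences and four of them match.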

In the molecular replacement method using the structure coordinates of the EGF-EGFR complex of the present invention, the structure coordinates explained in section “2. Structure coordinates of EGF-EGFR complex” can be used as structure coordinates. In addition, structure coordinates of a protein or a protein complex obtained using the structure coordinates of an EGF-EGFR complex by a homology modeling method described below can also be used.

Structure coordinates of a protein or a protein complex, which are newly obtained by the molecular replacement method of the present invention, are also encompassed in the scope of the present invention.

6. Homology Modeling Method Using Structure Coordinates of EGF-EGFR Complex

The present invention provides a method for obtaining structure coordinates of a protein or a protein complex with an unknown structure by a homology modeling method using structure coordinates of an EGF-EGFR complex.

Homology modeling is a technique of predicting the unknown structure of a protein (target) based on the known structure of a protein (template) having a sequence analogous thereto. In homology modeling, first, an analogous sequence is searched for and found in a structure database, and then the sequences are aligned. Next, based on the sequence of the aligned template, the structure of a corresponding portion is selected so as to construct a putative structure.
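The alignment step can be illustrated with a textbook global alignment. The Python sketch below implements the Needleman-Wunsch algorithm with a simple match/mismatch/gap scoring scheme. This algorithm is not named in the text and is swapped in purely for illustration. Programs such as BLAST™ or ClustalW™ use amino acid substitution matrices and more elaborate gap penalties, so the scoring values and sequences here are assumptions.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-1):
    """Globally align sequences a and b; return (aligned_a, aligned_b, score)."""
    n, m = len(a), len(b)
    # score[i][j] is the best score for aligning a[:i] with b[:j].
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1]); out_b.append(b[j - 1]); i -= 1; j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1]); out_b.append("-"); i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1]); j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b)), score[n][m]

# Toy target and template fragments for illustration only.
toy_alignment = needleman_wunsch("ACDE", "ADE")
```

The aligned pair then tells the modeling step which template residue supplies the coordinates for each target residue, and which target residues fall in gaps that need separate treatment such as loop modeling.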

There are two methods of homology modeling. One is based on fragments, and the other is based on restraints. The method based on fragments involves modeling by collecting fragments obtained from proteins with known structures, utilizes the average structure of the fragments, and employs other techniques such as loop modeling for portions that are not structurally conserved. The method based on restraints involves modeling to match restraints (e.g., distances between α carbons, and dihedral angles of the main chain and side chains) representing structural features. Restraints are expressed by evaluation functions so as to conduct minimization. A representative example of a program used in this case is Sali's MODELLER of Rockefeller University, but examples are not limited thereto.

There is a limitation in that homology modeling cannot be applied in the absence of approximately 20% or more, and preferably 30% or more, homology with a known protein. However, compared with other approaches, homology modeling has advantages in that it can be applied to a large protein and in that prediction accuracy improves as the structure database improves, because homology modeling directly utilizes the database content. Selection and alignment of templates have a large effect on the accuracy of structure prediction.

The homology modeling method using the structure coordinates of the EGF-EGFR complex according to the present invention comprises some or all of the steps of: preparing an amino acid sequence of a protein with an unknown structure to be modeled and an amino acid sequence of a protein to be used as a template for modeling; aligning both amino acid sequences; generating coordinates of a target protein using the structure information of the template protein as a template; and verifying that there is no theoretical problem in the thus obtained model structure.

It is very useful if the structure of the EGF-EGFR complex is revealed with industrially applicable accuracy, as this enables drug design by analysis of the information. The crystal structure of the EGF-EGFR complex is not only useful in drug design of an EGFR agonist or an EGFR antagonist. The use of the crystal structure of the complex also enables highly accurate prediction of the three-dimensional structures of all receptors assumed, based on their homology with EGFR, to have the same conformational folds. Examples of such a receptor in a liganded state include an insulin receptor, an insulin-like growth factor-1 receptor, and ErbB2, 3, and 4. Obtainment of the EGF-EGFR co-crystal structure by the present invention has revealed that EGFR undergoes a dynamic conformational change as a result of binding with a ligand. The present invention makes it possible to obtain information for structure prediction of a protein having EGF/EGFR-like folds that is far more useful than that available in the prior art. For example, general homology modeling using the EGFR structure enables highly accurate prediction of the active conformation of an insulin receptor upon binding to insulin. Moreover, by specifying a site binding with insulin and designing a molecule complementary thereto, a low molecular weight compound having insulin-like activity can also be provided. An insulin-like compound that can be orally administered has long been desired. The use of this technology enables the provision of information that is far more reasonable than that obtained by searches through a trial-and-error process. Similarly, prediction of the active conformation is enabled for all proteins having EGFR-like folds. Thus, the present invention makes it possible to conduct analyses for the purposes of useful scientific analyses and drug creation (e.g., anticancer agents).

Information and programs required for each of the above steps of comparing information regarding an amino acid sequence of a target protein with that regarding an amino acid sequence of a template protein, specifying a partial sequence composing a domain, generating coordinates of a target protein using the structure information of a template protein as a template, verifying that there are no theoretical problems in the thus obtained model structure, and specifying amino acid residues composing an interaction site can be obtained using a database that is open to the public or using various commercially available programs. Known protein information necessary for specifying domains is available from public databases, such as the Genbank protein database (Benson, D. A., et al., “GenBank,” NUCLEIC ACIDS RES. 30(1):17-20 (2002)) or the PDB database (Berman, H. M. et al., NUCLEIC ACIDS RES. 28:235-242 (2000)). In addition, for amino acid sequence comparison, BLAST™ (Altschul, S. F., J. MOL. EVOL. 36:290-300 (1993); Altschul, S. F. et al., J. MOL. BIOL. 215:403-10 (1990)), the multiple sequence alignment program ClustalW™ (Thompson, J. D., Higgins, D. G., Gibson, T. J., Nucleic Acids Res. 22:4673-4680 (1994)), or any other available programs may be used. For a domain search, the PROSITE database, which is a motif database (Falquet, L., et al., “The PROSITE database, its status in 2002,” NUCLEIC ACIDS RES. 30(1):235-238 (2002)), Pfam (Sonnhammer, E. L. L., Eddy, S. R. and Durbin, R., Proteins, 28, pp. 405-420 (1997)), or the like can be utilized. Examples of a search program include HMMER™, which is a homology search program using the hidden Markov model (Durbin, R., Eddy, S. R., Krogh, A., Mitchison, G., Biological Sequence Analysis: Probabilistic Models of Protein and Nucleic Acids, Cambridge University Press 1999), and the transmembrane prediction program tmap, which uses a weight matrix (Persson, B., Argos, P., J. MOL. BIOL. 237:182-92 (1994)). Furthermore, as a protein modeling program, FAMS™ (Ogata, K. et al., J. MOL. GRAPH. MODEL. 18:258-272 (2000)), Modeler™ (Accelrys Inc., San Diego, Calif.), Homology™ (Accelrys Inc., San Diego, Calif.), or the like can be used. As a protein structure evaluation program, Profiles-3D™ (Accelrys Inc., San Diego, Calif.) can be used. As a graphics display program, InsightII™ (Accelrys Inc., San Diego, Calif.), SYBYL™ (Tripos, Inc., St. Louis, Mo.), or the like can be used. The contents of these databases and programs may be improved, or new programs and the like may be developed, in the future. These databases and programs can be utilized as long as they have functions necessary for the implementation of the present invention, and the examples are not limited to the above examples.

In the homology modeling method using the structure coordinates of the EGF-EGFR complex of the present invention, as the structure coordinates, those explained in section “2. Structure coordinates of EGF-EGFR complex” can be used. Moreover, structure coordinates of a protein or a protein complex obtained using the structure coordinates of the EGF-EGFR complex by the above molecular replacement method can also be used.

Structure coordinates of a protein or a protein complex that are newly obtained by the homology modeling method of the present invention are also encompassed in the scope of the present invention.

The homology modeling method using the structure coordinates of the EGF-EGFR complex is explained specifically as follows. An amino acid sequence of a receptor having the same folds as those of EGFR, such as a human insulin receptor, an insulin-like growth factor-1 receptor, or ErbB2, 3, or 4, is extracted from an existing amino acid sequence database. This amino acid sequence is aligned with that of EGFR. As a sequence alignment program or a multiple alignment program, FASTA™, BLAST™, ClustalW™, or the like can be used. Next, homology modeling can be conducted according to a standard method using the EGF-EGFR co-crystal structure specified in the present invention or a partial structure thereof as a template, together with existing package software (e.g., Homology™ (Accelrys Inc.) and FAMS (KITASATO UNIVERSITY)). Calculation for molecular optimization of the initial structure is conducted using the force field of Discover™, Charm™, Amber™, or the like. After minimization, a putative structure (thought to have extremely high accuracy) can be constructed for a liganded receptor having the same folds as those of EGFR, such as an insulin receptor, an insulin-like growth factor-1 receptor, or an EGFR family member (e.g., ErbB2, 3, or 4). As an example, a modeling structure of human ErbB2 is shown in FIG. 12. Amino acid residues important for binding with a ligand have been determined based on the thus obtained modeling structure of each protein, and are presented in FIG. 11. These structures are useful in pharmacophore extraction of agonists or antagonists, computer screening, molecular design of agonists or antagonists (e.g., increased activity and provision of selectivity), design of industrially useful altered proteins, production of neutralizing antibodies and agonist antibodies, novel crystal structure analysis by the molecular replacement method, analysis of other variants, and the like. The thus specified pharmacophores can be utilized for screening for or development of drugs.

In addition, a binding model of a hit compound selected by screening, or of a derivative thereof, may be built and then used to optimize derivatives of the obtained compound.

7. Method for Designing Epitope Using Structure Coordinates of EGF-EGFR Complex and Antibody Production

Through the use of the structure coordinates of the EGF-EGFR complex of the present invention, an epitope for an antibody can be rationally designed. This enables efficient and rational production of neutralizing antibodies and agonist antibodies. For example, observation of the structure of the complex of the present invention has revealed that Cys33-Trp49 (CVVGYIGERCQYRDLKW: SEQ ID NO: 10) of EGF is a very useful sequence for production of a human EGF neutralizing antibody. It has also revealed that Ser11-Asn32 (SNKLTQLGTFEDHFLSLQRMFN: SEQ ID NO: 11) in domain I, Pro241-Thr266 (PPLMLYNPTTYQMDVNPEGKYSFGAT: SEQ ID NO: 12) in domain II, and Leu345-Leu363 (LHILPVAFRGDSFTHTPPL: SEQ ID NO: 13) in domain III of EGFR are very useful in production of a neutralizing antibody or an agonist antibody.
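As a simple consistency check on the residue ranges quoted above, the following minimal sketch compares each stated range with the length and terminal residues of the accompanying one-letter sequence. The dictionary layout and the helper name `check_epitope` are illustrative assumptions and are not part of the specification.

```python
import re

# Residue ranges and one-letter sequences exactly as quoted in the text.
EPITOPES = {
    "EGF Cys33-Trp49 (SEQ ID NO: 10)": "CVVGYIGERCQYRDLKW",
    "EGFR Ser11-Asn32 (SEQ ID NO: 11)": "SNKLTQLGTFEDHFLSLQRMFN",
    "EGFR Pro241-Thr266 (SEQ ID NO: 12)": "PPLMLYNPTTYQMDVNPEGKYSFGAT",
    "EGFR Leu345-Leu363 (SEQ ID NO: 13)": "LHILPVAFRGDSFTHTPPL",
}

# Only the three-letter codes that occur in the ranges above.
THREE_TO_ONE = {"Cys": "C", "Trp": "W", "Ser": "S", "Asn": "N",
                "Pro": "P", "Thr": "T", "Leu": "L"}


def check_epitope(label, sequence):
    """Return True if the range in `label` matches the sequence length and ends."""
    match = re.search(r"([A-Z][a-z]{2})(\d+)-([A-Z][a-z]{2})(\d+)", label)
    first_aa, first_no = match.group(1), int(match.group(2))
    last_aa, last_no = match.group(3), int(match.group(4))
    return (len(sequence) == last_no - first_no + 1
            and sequence[0] == THREE_TO_ONE[first_aa]
            and sequence[-1] == THREE_TO_ONE[last_aa])


for label, sequence in EPITOPES.items():
    print(label, len(sequence), check_epitope(label, sequence))
```

Each of the four ranges is consistent with its sequence under this check; the check says nothing about whether the sequence occupies that position in the full-length protein.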

The method for designing an epitope, which uses the structure coordinates of the EGF-EGFR complex, can comprise the following steps of: entering the structure coordinates of the EGF-EGFR complex into a computer; and specifying a portion that can be used as an epitope by analyzing the structure of the EGF-EGFR complex. The method may further comprise a step of visually displaying the three-dimensional structure of the EGF-EGFR complex on the computer. The step of specifying a portion that can be used as an epitope by analyzing the structure of the EGF-EGFR complex can be implemented by visual observation and/or analysis using a computer program. Furthermore, the method for producing an anti-EGFR antibody or an anti-EGF antibody using the design method can comprise the following steps of: designing an epitope using the structure coordinates of the EGF-EGFR complex; preparing an antigen containing the designed epitope; and preparing an antibody that recognizes the designed epitope.

To design an epitope, the structure coordinates of the EGF-EGFR complex of the present invention are introduced into a computer program that can graphically display molecular three-dimensional structures. The three-dimensional structure of the EGF-EGFR complex is visually observed, or an appropriate computer program is used, so that a portion that can be used as an epitope is specified. As a portion that can be used as an epitope, a portion exposed on the molecular surface, or a portion that can come into contact with a solvent (e.g., a water molecule), is preferred. Moreover, when the epitope is a portion that forms a protein-protein interaction site in the complex but is exposed on the molecular surface, or can come into contact with a solvent (e.g., a water molecule), in the monomer, a neutralizing antibody that obstructs the formation of the complex can be designed. Meanwhile, an antibody that links ligand-receptor binding sites or an antibody that links dimerization sites is inferred to function as an agonist antibody. Furthermore, to design an antibody having neither agonist activity nor neutralizing activity, an epitope having no effect on EGF-EGFR binding and dimerization can also be designed.
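The selection rule for neutralizing-antibody epitopes described above (exposed in the monomer, buried at an interface in the complex) can be sketched as follows. This is only an illustration under stated assumptions: the per-residue accessible surface areas are taken as already computed by some surface-area program, the residue numbers and area values are hypothetical, and the two thresholds are arbitrary choices rather than values from the specification.

```python
def interface_epitope_candidates(asa_monomer, asa_complex,
                                 min_exposed=20.0, min_buried=10.0):
    """Return residues exposed in the monomer that become buried in the complex.

    asa_monomer / asa_complex map residue number -> accessible surface area
    (in square angstroms). Both thresholds are hypothetical illustration values.
    """
    candidates = []
    for residue, exposed in asa_monomer.items():
        # Area lost on complex formation; residues missing from the complex
        # table are treated as unchanged.
        buried = exposed - asa_complex.get(residue, exposed)
        if exposed >= min_exposed and buried >= min_buried:
            candidates.append(residue)
    return sorted(candidates)


# Hypothetical values for four residues, for illustration only.
ASA_MONOMER = {101: 85.0, 102: 5.0, 103: 60.0, 104: 40.0}
ASA_COMPLEX = {101: 12.0, 102: 4.0, 103: 58.0, 104: 25.0}

print(interface_epitope_candidates(ASA_MONOMER, ASA_COMPLEX))
```

With these hypothetical inputs, residues 101 and 104 are selected: residue 102 is already buried in the monomer, and residue 103 stays exposed in the complex and would instead suit an epitope that affects neither binding nor dimerization.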

After the portion that can be used as an epitope is specified, a peptide having an amino acid sequence of the epitope is synthesized, and then an antibody is produced according to a standard method using the peptide as an antigen. Alternatively, after antibodies are produced using a whole or a portion of a protein as an antigen, an antibody recognizing a desired epitope can be selected and obtained.

An antibody may be a monoclonal antibody or a polyclonal antibody. A monoclonal antibody is preferred because of its high specificity.

The above monoclonal antibody can be prepared by preparing a hybridoma by fusing an antibody-producing cell obtained from an animal immunized with an antigen with a myeloma cell, and then selecting clones producing an antibody that specifically recognizes a desired epitope from the obtained hybridoma.

As a peptide to be used as an antigen for immunization of an animal, a peptide prepared by the recombinant DNA method or chemical synthesis is preferred. Production of a monoclonal antibody is well known in the art. Briefly, the preparation involves administering a peptide as an antigen together with an adjuvant to mammals such as mice, rats, horses, monkeys, rabbits, goats, or sheep for immunization. Immunization intervals are not specifically limited, and immunization is conducted at intervals of several days to several weeks. After final immunization, antibody-producing cells (e.g., splenocytes, lymph node cells, and peripheral blood cells) are collected. Next, the antibody-producing cells are fused with myeloma cells. As the myeloma cells to be fused with the antibody-producing cells, an established cell line that is derived from various animals including a mouse, a rat, a human, and the like and is generally available to persons skilled in the art is used. A cell line used herein has drug resistance, is unable to survive in a selection medium (e.g., HAT medium) in an unfused state, and is capable of surviving only in a fused state. As myeloma cells, various previously known cell lines, for example, P3 (P3x63Ag8.653), P3x63Ag8U.1, and the like, can be appropriately used.

Cell fusion is conducted by bringing myeloma cells into contact with antibody-producing cells at a mixing ratio between 1:1 and 1:10 in the presence of a fusion promoter in a medium for culturing animal cells such as MEM, DMEM, or RPMI-1640 medium. To promote cell fusion, polyethylene glycol, polyvinyl alcohol, or the like with an average molecular weight between 1,000 and 6,000 can be used. In addition, antibody-producing cells can also be fused with myeloma cells using a commercial cell fusion apparatus utilizing electric stimulation.

Hybridomas are selected from cells that have been subjected to cell fusion treatment. An example of such a method is a method utilizing selective growth of cells in a selection medium.

Specifically, a cell suspension is diluted with an appropriate medium, and then inoculated on a microtiter plate. A selection medium (e.g., HAT medium) is added to each well, and then the cells are cultured while the selection medium is exchanged as appropriate. As a result, cells that have grown can be obtained as hybridomas.

Screening of hybridomas is conducted by a limiting dilution method, a fluorescence-activated cell sorting method, or the like. Finally, monoclonal-antibody-producing hybridomas are obtained. When the collected antibodies need to be purified, they are purified by an appropriate selection or combination of known methods such as an ammonium sulfate precipitation method, ion exchange chromatography, and affinity chromatography.

To evaluate an antibody, it is preferred to evaluate by biochemical assay whether or not it has neutralizing activity or agonist activity, in addition to whether or not it specifically recognizes a desired epitope. The form of biochemical assay to be employed herein is not limited. For example, the above-described example of biochemical assay can be employed.

The method for designing an epitope of the present invention can also be applied to, in addition to EGF and EGFR, a protein or a protein complex whose structure has been determined by the homology modeling method using the structure coordinates of the EGF-EGFR complex, and a protein or a protein complex whose structure has been determined by the molecular replacement method using the structure coordinates of the EGF-EGFR complex. In addition, an antibody that has been produced utilizing the method for designing an epitope of the present invention is also encompassed in the scope of the present invention.

8. Peptide Fragment

The present invention provides a peptide (polypeptide) or a salt thereof comprising all or some of the amino acid residues of a region forming an EGFR dimerization site. The region at which the EGFR proteins of a complex bind to each other to form a dimer (EGFR dimerization site) consists of, for example, the 240th to the 267th amino acid residues (SEQ ID NO: 14) of the amino acid sequence of EGFR (the amino acid sequence shown in SEQ ID NO: 1), and the peptide contains all or some of the amino acid residues of this region.

Furthermore, the present invention also provides a peptide or a salt thereof comprising all or some of the amino acid residues of a region forming a binding site of EGF and EGFR.

Examples of the peptide of the present invention include Cys33-Trp49 of EGF (SEQ ID NO: 10), which is a very useful sequence for producing a human EGF neutralizing antibody, and Ser11-Asn32 in domain I (SEQ ID NO: 11), Pro241-Thr266 in domain II (SEQ ID NO: 12), and Leu345-Leu363 in domain III (SEQ ID NO: 13) of EGFR.

When the peptide of the present invention is chemically synthesized, it can be synthesized by a standard means for peptide synthesis. Examples of such a means include an azide method, an acid chloride method, an acid anhydride method, a mixed acid anhydride method, a DCC method, an active ester method, a method using carbonyldiimidazole, and an oxidation-reduction method. In addition, either solid-phase synthesis or liquid-phase synthesis can be applied.

Specifically, a peptide is synthesized by condensing amino acids that can compose the peptide of the present invention, and then eliminating protecting groups when the product has protecting groups. Any known technique may be used as a condensation method or a method for eliminating protecting groups (e.g., Nobuo Izumiya et al., Basis and Experiment for Peptide Synthesis (Peptide go-sei no kiso to jikken), MARUZEN CO., LTD. (1975)). After the reaction, the peptide of the present invention can be purified by a combination of general purification methods such as solvent extraction, distillation, column chromatography, liquid chromatography, and recrystallization. Furthermore, in the present invention, the peptide can also be synthesized using a commercial automated peptide synthesizer (e.g., a simultaneous multiple solid-phase peptide synthesizer, PSSM-8, Shimadzu Corporation).

The peptide salt of the present invention is preferably a physiologically acceptable acid addition salt or a basic salt. Examples of an acid addition salt include a salt with an inorganic acid such as hydrochloric acid, phosphoric acid, hydrobromic acid, or sulfuric acid, or a salt with an organic acid such as acetic acid, formic acid, propionic acid, fumaric acid, maleic acid, succinic acid, tartaric acid, citric acid, malic acid, oxalic acid, benzoic acid, methanesulfonic acid, or benzenesulfonic acid. Examples of a basic salt include a salt with an inorganic base such as sodium hydroxide, potassium hydroxide, ammonium hydroxide, or magnesium hydroxide, or a salt with an organic base such as caffeine, piperidine, trimethylamine, or lysine. The salt can be prepared using an appropriate acid such as hydrochloric acid, or an appropriate base such as sodium hydroxide.

Furthermore, the peptide of the present invention has a C-terminus that is generally a carboxyl (-COOH) group or carboxylate (-COO⁻). The C-terminus may be amide (-CONH₂) or ester (-COOR). Here, examples of R in the ester include C₁₋₁₂ alkyl, C₃₋₁₀ cycloalkyl, C₆₋₁₂ aryl, and C₇₋₁₂ aralkyl.

Furthermore, the peptide of the present invention also encompasses a peptide in which the amino group of the N-terminal alanine residue is protected with a protecting group, a conjugated peptide such as a glycopeptide having a sugar chain bound thereto, and the like.

BRIEF DESCRIPTION OF DRAWINGS

The patent or application file contains at least one drawing executed in color. Copies of this patent or patent application publication with drawings will be provided by the Office upon request and payment of the necessary fee.

FIG. 1 a to 1 c show a crystal structure of a dimeric EGFR-EGF complex and a part of an electron density map generated at 3.5 Å resolution. FIG. 1 a shows a ribbon diagram oriented with the two-fold axis vertical. Most of domain IV is disordered. FIG. 1 b shows the top view of FIG. 1 a. A two-fold axis is mainly shown. FIG. 1 c is a stereo view of a 2Fo-Fc electron density map of a portion of domain I.

FIG. 2 a to 2 b show interaction between EGFR and EGF. FIG. 2 a shows the mapping of different interaction sites on the ribbon representation of EGFR and EGF. The three binding sites in the interface are outlined. FIG. 2 b is a stereo view of the interface at site 1. Only the side chains of interacting residues are shown. Dotted lines denote hydrogen bonds or salt bridges.

FIGS. 3 a and 3 b show interaction between EGFR and EGF (continued from FIGS. 2 a and 2 b). FIG. 3 a is a stereo view of the interface at site 2. FIG. 3 b is a stereo view of the interface at site 3.

FIG. 4 a to 4 c show interaction between receptors in the dimer interface. FIG. 4 a shows the outline of the binding region in the interface. Only the side chains of interacting residues are shown. FIG. 4 b is a stereo view of the interface. Dotted lines denote hydrogen bonds or salt bridges. FIG. 4 c is a stereo view of the interface viewed from the different direction shown by the arrow in FIG. 4 a.

FIG. 5 a to 5 c show potential models for EGF-dependent receptor dimerization. FIG. 5 a shows comparison of the overall folding of domains I, II, and III of liganded EGFR with the L1, S1, and L2 domains of unliganded IGF-1R. FIG. 5 b shows a putative structure of unliganded EGFR in the form of a monomer. Only the side chains of residues that may participate in ligand-dependent activation are shown. FIG. 5 c shows the structure of the dimeric EGFR-EGF complex. Only the side chains shown in FIG. 5 b are shown.

FIG. 6 shows photographs showing the result of SDS-PAGE analysis of deglycosylated EGFR and the result of SDS-PAGE analysis of EGF-dependent dimerization by chemical cross-linking.

A: Purified dEGFR (2 mg/ml) was incubated at 37° C. together with endoglycosidase H (1 U/ml) and D (0.4 U/ml) in a 0.1 M sodium acetate buffer (pH 5.8). After 0 hours (lane 1), 24 hours (lane 2), 48 hours (lane 3), 96 hours (lane 4), and 120 hours (lane 5) of digestion, products were collected, and the digested products were analyzed by 8% SDS-PAGE. In addition, a reaction mixture to which no endoglycosidase had been added (lane 6) or a reaction mixture to which no dEGFR had been added was incubated for 120 hours.

B: Purified dEGFR (0.8 mg/ml) was incubated in a 0.05 M Hepes-NaOH buffer (pH 7.5) supplemented with EGF (lane 1) or without EGF (lane 2), followed by incubation with a chemical cross-linking agent BS3 (1 mM).

FIG. 7 shows a microphotograph of a dEGFR-EGF complex crystal. The dimensions of the crystal are approximately 1.0×0.2×0.2 mm.

FIG. 8 shows comparison of the three-dimensional structure of a human insulin-like growth factor-1 receptor monomer (PDB-ID: 1IGR) and that of EGF-bound EGFR.

FIG. 9 shows examples of amino acid residues of EGF-EGFR binding sites that can be utilized for pharmacophore construction.

FIG. 10 shows examples of amino acid residues of EGFR dimerization sites that can be utilized for pharmacophore construction.

FIG. 11 shows ligand-binding sites inferred based on the model structures of ErbB family members, insulin receptor (IR), and insulin-like growth factor-1 receptor (IGF-1R) produced by the homology modeling method using the structure coordinates of the EGF-EGFR complex.

FIG. 12 shows the modeling structure of human ErbB2 produced by the homology modeling method using the structure coordinates of the EGF-EGFR complex. Dark gray portions represent putative ligand-binding sites.

FIG. 13 shows an example of a pharmacophore constructed using the structure coordinates of the EGF-EGFR complex.

FIG. 14 shows photographs showing the Connolly surface images of EGF and EGFR (receptor).

FIG. 15 shows a pharmacophore example (hypothesis “a”) constructed using the structure coordinates of the EGF-EGFR complex.

FIG. 16 shows a pharmacophore example (hypothesis “b”) constructed using the structure coordinates of the EGF-EGFR complex.

FIG. 17 shows a pharmacophore example (hypothesis “c”) constructed using the structure coordinates of the EGF-EGFR complex.

FIG. 18 shows a pharmacophore example (hypothesis “d”) constructed using the structure coordinates of the EGF-EGFR complex.

FIG. 19 shows saturation curves upon binding of europium-labeled EGF with EGFR on A431 cells.

FIG. 20 shows binding inhibition by EGFR antibody (Ab-3) against binding of EGFR (A431 cell) with europium-labeled EGF (10 nM).

FIG. 21 shows a photograph showing inhibition activity of a test compound on EGFR phosphorylation. Lane 1 denotes a molecular-weight marker. Lane 2 denotes a sample to which no EGF has been added and then electrophoresed, lane 3 denotes a sample to which EGF has been added and then electrophoresed, lane 4 denotes a sample to which EGF and 2-[2-(3-ethyl-5,5,8,8-tetramethyl-5,6,7,8-tetrahydro-naphthalene-2-yl)-2-oxo-ethylsulfanyl]-nicotinic acid (100 μg/mL) have been added and then electrophoresed, lane 5 denotes a sample to which EGF and 10-[3-(4-methyl-piperazine-1-yl)-propyl]-2-trifluoromethyl-10H-phenothiazine dihydrochloride (10 μg/mL) have been added and then electrophoresed, and lane 6 denotes a sample to which EGF and 8-hexylsulfanyl-3-methyl-7-propyl-3,7-dihydro-purine-2,6-dione (10 μg/mL) have been added and then electrophoresed.

FIG. 22 shows photographs showing EGF-dependent EGFR activation in CHO cells expressing variant EGFR.

-   A: EGF-dependent ERK phosphorylation is shown.
-   B: Cell surface expression of variant EGFR is shown.
-   C: EGF-induced autophosphorylation of EGFR is shown.
-   D: Binding of ¹²⁵I-labeled human EGF (2 nM) to cells expressing variant EGFR is shown. SD (n=3) is shown as an error bar.

BEST MODE OF CARRYING OUT THE INVENTION

The present invention will be hereafter described in detail by referring to examples, but the technical scope of the present invention is not limited by these examples.

EXAMPLE 1 Expression of EGF and EGFR

(1) Expression of EGFR

Using a forward primer and a reverse primer shown below, the expression plasmid pcDNA-sEGFR (amino acids 1 to 619 of SEQ ID NO: 1) was prepared by standard PCR protocols.

Forward primer (including a restriction enzyme cleavage site and an ATG initiation codon)

Sequence: 5′-cgg gga gct agc atg cga ccc tcc ggg acg gcc ggg-3′ (SEQ ID NO: 8)

Reverse primer (including a thrombin cleavage site, a FLAG epitope, and a TAG stop codon)

Sequence: 5′-gcgc ctt aag cta ctt gtc atc gtc gtc ctt gta gtc gga tcc acg cgg aac cag gat ctt agg ccc att cgt tgg ac-3′ (SEQ ID NO: 9)

PCR was conducted for 25 cycles of reaction (94° C. for 30 seconds, 55° C. for 15 seconds, and 75° C. for 1 minute) using pcDNA-EGFR containing DNA encoding full-length human EGFR as a template and Pwo DNA polymerase (Roche).

The amplified DNA was cloned into pcDNA3.1/Zeo(+) (Invitrogen), a plasmid for mammalian expression, thereby preparing the plasmid pcDNA-sEGFR. The vector plasmid was transfected into Lec8 cells (ATCC CRL-1737) by the Lipofectamine method according to the instructions of the kit (Gibco-BRL). Here, the Lec8 cells produce a protein having N-linked oligosaccharides that lack terminal sialic acid and galactose residues. Stable transfectants were selected using Zeocin (Invitrogen) at a concentration of 0.1 mg/ml, thereby isolating independent colonies. Subsequently, it was confirmed using the anti-FLAG M2 antibody (Sigma) that the thus obtained cells had expressed EGFR.
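The elements named in the primer descriptions can be located by translating the primers themselves. The sketch below is an illustration only: the helper names are assumptions, the standard genetic code is used, and the FLAG peptide (DYKDDDDK) and the thrombin recognition sequence (LVPRGS) are standard motifs that are not spelled out in the text. It translates the forward primer from its ATG codon and the reverse complement of the reverse primer in the frame that contains the FLAG tag.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(dna):
    """Return the reverse complement of a DNA string (upper case)."""
    return dna.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(dna, offset=0):
    """Translate from `offset` until the first stop codon or the end."""
    dna = dna.upper()
    protein = []
    for pos in range(offset, len(dna) - 2, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Primer sequences copied from SEQ ID NO: 8 and SEQ ID NO: 9 above.
FORWARD = "cgg gga gct agc atg cga ccc tcc ggg acg gcc ggg".replace(" ", "")
REVERSE = ("gcgc ctt aag cta ctt gtc atc gtc gtc ctt gta gtc gga tcc acg cgg "
           "aac cag gat ctt agg ccc att cgt tgg ac").replace(" ", "")

# N-terminal residues encoded downstream of the ATG initiation codon.
n_term = translate(FORWARD, FORWARD.upper().find("ATG"))

# C-terminal residues: pick the reading frame that contains the FLAG tag.
coding_strand = reverse_complement(REVERSE)
c_term = next(p for p in (translate(coding_strand, f) for f in range(3))
              if "DYKDDDDK" in p)

print(n_term)  # MRPSGTAG
print(c_term)  # PTNGPKILVPRGSDYKDDDDK
```

Read this way, the reverse primer encodes the end of the receptor fragment (PTNGPKI) followed by LVPRGS, DYKDDDDK, and the stop codon, which matches the order of elements given in the primer description.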

(2) Purification of EGFR

The Lec8 transfectants were maintained (37° C., 5% CO₂ incubator) in a 1:1 mixture of Dulbecco's modified Eagle's medium and Ham's F-12 medium (DMEM/F12) supplemented with 10% calf serum (Gibco-BRL), 100 units/ml penicillin, 100 μg/ml streptomycin, and 20 mg/l proline. For large-scale EGFR production, the Lec8 transfectants were inoculated in DMEM/F12 supplemented with 1% calf serum, 100 units/ml penicillin, 100 μg/ml streptomycin, and 20 mg/l proline, followed by proliferation for 3 to 4 days. Subsequently, the culture supernatants were collected, and then centrifuged at 1500 rpm for 15 minutes. The cells were collected from the resulting flasks by scraping, and then transferred to other flasks for expansion culture. The overall process was repeated while maintaining the cells for 3 months. EDTA and PMSF were each added to the collected culture supernatants at a final concentration of 0.4 mM. Serum globulin was removed by ammonium sulfate precipitation at 4° C. (280 g of ammonium sulfate was added per 1000 mL of medium), and then a further 280 g of ammonium sulfate was added per 1000 mL of medium, thereby precipitating EGFR. The pellets were dissolved in 0.02 M phosphate buffer (pH 7.1) (buffer A) and then dialyzed. The thus obtained EGFR-containing sample was stored at −80° C. The next step was conducted at 4° C. The sample was loaded on tandemly connected, buffer A-equilibrated columns of DEAE-Toyopearl 650S (TOSOH CORPORATION), CM-Toyopearl 650S (TOSOH CORPORATION), and Affi-Gel Blue Gel (Bio-Rad). The protein was eluted from the disconnected Affi-Gel Blue column with a linear gradient of 0.02-0.6 M NaCl in buffer A. The eluate was applied to an anti-FLAG M2 affinity gel (Sigma) column equilibrated with buffer A containing an additional 0.15 M NaCl and circulated overnight. Proteins bound to the column were recovered with 0.1 M glycine-HCl buffer (pH 3.5), and were neutralized immediately. The thus obtained sample was concentrated with a UK-10 membrane (TOSOH CORPORATION), dialyzed against 0.02 M Tris-HCl (pH 8.0) and 0.02 M NaCl (buffer B), and then loaded onto a Mono-Q column (Pharmacia) equilibrated with buffer B. Protein was eluted with a linear gradient of 0.02-0.2 M NaCl in buffer B. The thus obtained EGFR was concentrated with Centricon-10 (Amicon), and then enzymatically deglycosylated as described below. EGFR at 2 mg/ml, the concentration of which was determined using the extinction coefficient (0.65) at 280 nm, was incubated with 0.4 unit/ml endoglycosidase D (SEIKAGAKU CORPORATION) and 1 unit/ml endoglycosidase H (Roche) in a 0.1 M sodium acetate buffer (pH 5.3) under a nitrogen or an argon gas atmosphere at 37° C. for 50 hours. After the reaction mixture was diluted 10 fold with buffer B, deglycosylated EGFR (dEGFR) was purified on the Mono-Q column in the same manner as in the above Mono-Q procedure.

For the expression of EGFR having selenomethionine instead of methionine, DMEM/F12 (SeMet medium) containing 50 ml of fresh selenomethionine instead of methionine, 1% dialyzed calf serum, 100 units/ml penicillin, 100 μg/ml streptomycin, and 40 mg/l proline was prepared. For the production of selenomethionyl EGFR, cells were grown to achieve a confluent culture in a manner similar to the above, followed by exchange of the medium to SeMet medium. After a 12 to 24 hour preincubation, the cells were cultured in fresh SeMet medium for 2 to 3 days, and then the medium was harvested. Selenomethionyl dEGFR was purified in the same way as for the native dEGFR as described above.

(3) Results

Extracellular domains of EGFR had previously been purified from conditioned medium of an A431 cancer cell line, a recombinant insect cell line expressing EGFR, or a recombinant Chinese hamster ovary cell line expressing EGFR. For crystallization, we developed a purification method for isolating extracellular domains of EGFR produced by recombinant Lec8 cells.

First, most of the contaminants, such as serum proteins, were removed by ammonium sulfate fractionation. Next, the remaining contaminants were removed by affinity chromatography using the FLAG tag. The heterogeneity of the sample was decreased by anion exchange chromatography. Small crystals were grown from the purified protein that had not been deglycosylated, but no X-ray diffraction images were obtained. To further decrease the heterogeneity of the molecular surfaces, the purified protein was extensively deglycosylated using endoglycosidase H and endoglycosidase D. Anion exchange chromatography was then conducted to remove the enzymes and digested glycans. As a result, when determined by SDS-PAGE, the deglycosylated protein had shifted to a size of approximately 74 kDa as expected (FIG. 6). Since the FLAG tag could not be cleaved off with thrombin, FLAG-tagged dEGFR was used for crystallization and other experiments. The yield of dEGFR was approximately 1 mg per 3.5 l of the collected medium.

EXAMPLE 2 Crystallization of Dimeric EGF-EGFR Complex

(1) Crystallization

The dEGFR-EGF complex for crystallization was prepared as follows. The purified dEGFR solution was concentrated to 10 mg/ml, as determined using the extinction coefficient (0.79) at 280 nm, using Centricon-10. Recombinant hEGF (Pepro-Tech) was purified by reverse phase HPLC, freeze-dried, and then dissolved in the dEGFR solution in an equimolar amount. Crystallization was initiated by the hanging-drop vapor diffusion method starting from 3 μl of 11 mg/ml complex solution and 3 μl of reservoir buffer (15% PEG4000, 1% PEG6000, 0.075 M Tris-HCl (pH 8.4), 0.075 M sodium acetate, and 0.2 M sodium chloride). Crystals were grown at 20° C. for 3 weeks. The macro seeding method was used to reproducibly grow the crystal to 1×0.2×0.2 mm or more. We measured XAFS spectra and collected diffraction data sets using BL41XU at the large synchrotron radiation facility SPring-8. Measurement was conducted at 100 K in the presence of cryoprotectant (18% PEG4000, 1.2% PEG6000, 0.09 M Tris-HCl (pH 8.4), 0.09 M sodium acetate, 0.24 M sodium chloride, 16% trehalose and 11% PEG400).
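The concentrations quoted above can be cross-checked with a short calculation. The sketch below assumes a dEGFR mass of about 74 kDa (the SDS-PAGE estimate given in Example 1) and an hEGF mass of about 6.2 kDa (an assumption not stated in the text). It estimates the total protein concentration after an equimolar amount of hEGF is dissolved in 10 mg/ml dEGFR, and scales the reservoir components by 1.2 for comparison with the listed cryoprotectant values.

```python
DEGFR_KDA = 74.0  # approximate dEGFR mass from SDS-PAGE (Example 1)
EGF_KDA = 6.2     # assumed hEGF mass; not given in the text


def complex_mg_per_ml(receptor_mg_per_ml):
    """Total protein concentration after adding equimolar hEGF to dEGFR."""
    receptor_mM = receptor_mg_per_ml / DEGFR_KDA  # (mg/ml) / (kg/mol) = mmol/l
    ligand_mg_per_ml = receptor_mM * EGF_KDA
    return receptor_mg_per_ml + ligand_mg_per_ml


# Reservoir buffer components as listed in the text.
RESERVOIR = {
    "PEG4000 (%)": 15.0,
    "PEG6000 (%)": 1.0,
    "Tris-HCl (M)": 0.075,
    "sodium acetate (M)": 0.075,
    "sodium chloride (M)": 0.2,
}

# Components at 1.2-fold concentration, rounded for display.
scaled = {name: round(1.2 * value, 3) for name, value in RESERVOIR.items()}

print(round(complex_mg_per_ml(10.0), 2))
print(scaled)
```

Under these assumptions the complex solution comes to roughly 10.8 mg/ml, consistent with the 11 mg/ml quoted for the drop, and the scaled reservoir values reproduce the PEG4000, PEG6000, Tris-HCl, sodium acetate, and sodium chloride concentrations listed for the cryoprotectant.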

(2) Experiment of Chemical Cross-Linking

To examine whether dEGFR undergoes EGF-induced dimerization in the same way as wild-type EGFR, we conducted analysis by chemical cross-linking (FIG. 6).

0.8 mg/ml dEGFR was incubated with 0.07 mg/ml hEGF dissolved in 0.05 M Hepes-NaOH (pH 7.5) and 1 mM bis(sulfosuccinimidyl) suberate (BS3, Pierce Chemical) cross-linking agent. The reaction mixture was allowed to stand for 60 minutes, and then the cross-linking reaction was quenched using 0.01 mM glycine. The cross-linking product was analyzed by SDS-PAGE.

As a result, in the presence of EGF, a single species with high molecular weight (corresponding to a dimeric size of 150 kD) was formed. In the absence of EGF, no such high-molecular-weight species was detected, suggesting that dEGFR had formed a dimer in response to EGF.

(3) Result (Crystallization of Native Crystal and Data Collection)

The sparse matrix method was applied for screening for crystallization conditions. Crystal Screen Kit I and II (Hampton Research) were used at 20° C. by the hanging drop vapor diffusion method, with 11 mg/ml of receptors that had been allowed to form complexes with an equimolar amount of hEGF. The most promising conditions (No. 22 of Screen Kit I) were then refined into conditions for growing crystals having diffraction ability. Under the refined crystallization conditions, the crystals generally grew to a final size within approximately 4 weeks (FIG. 7). The crystal diffracted to approximately 3.5 Å with a mosaicity as high as 1°. To improve the crystal quality, the crystal was immersed in cryoprotectant, cooled stepwise to 0° C., and then allowed to stand on ice. The mosaicity decreased to approximately 0.5°. The solvent content of the native crystal is 75% and the native crystal belongs to the space group of P3₁21 with a=b=220.2 Å and c=113.1 Å.
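The reported solvent content can be reproduced from the cell dimensions with a Matthews-type estimate. The sketch below is illustrative and rests on assumptions: one 2:2 complex per asymmetric unit (as stated in Example 3), six asymmetric units per cell for space group P3₁21, masses of about 74 kDa for dEGFR and about 6.2 kDa for EGF (the latter not given in the text), and the conventional approximation that the protein volume fraction is 1.23 divided by the Matthews coefficient.

```python
import math


def hexagonal_cell_volume(a, c):
    """Unit cell volume in cubic angstroms for a=b, alpha=beta=90, gamma=120."""
    return a * a * c * math.sin(math.radians(120.0))


A_AXIS, C_AXIS = 220.2, 113.1          # native crystal, from the text
ASYMMETRIC_UNITS_PER_CELL = 6          # P3(1)21 has six symmetry operators
COMPLEX_DA = 2 * 74_000 + 2 * 6_200    # 2:2 complex; EGF mass is an assumption

volume = hexagonal_cell_volume(A_AXIS, C_AXIS)
matthews = volume / (ASYMMETRIC_UNITS_PER_CELL * COMPLEX_DA)  # A^3 per Da
solvent_fraction = 1.0 - 1.23 / matthews

print(f"cell volume      = {volume:,.0f} cubic angstroms")
print(f"Matthews coeff.  = {matthews:.2f} cubic angstroms per Da")
print(f"solvent fraction = {solvent_fraction:.2f}")
```

With these inputs the estimate is about 4.9 Å³/Da, corresponding to a solvent fraction of about 0.75, in line with the 75% stated above.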

EXAMPLE 3 Structure Analysis of Dimeric EGF-EGFR Complex

(1) Method

A native crystal and a selenomethionyl crystal belong to the space groupof P3₁21 with a=b=220.2 Å and c=113.1 Å and with a=b=221.2 Å and c=114.2Å, respectively. The asymmetric unit contains one 2:2 complex with 75%solvent content. The crystals were cautiously transferred by many stepsfrom a harvest buffer (1.2-fold concentration of reservoir buffer) tocryoprotectant prepared by adding 16% trehalose and 11% PEG400 toharvest buffer. Before cryopreservation with liquid nitrogen, thecrystals were slowly cooled to 4° C., followed by placing the crystalson ice for 2 days. This treatment decreased the mosaicity of thecrystals. Diffraction data sets were collected from the cryopreservedcrystals at 100 K on beamline BL41XU at SPring-8. The program HKL wasused for data processing.

The scaled data set of the selenomethionyl crystals at a peak wavelengthwas used to calculate the normalized structure factor with the programDREAR™, and the program SnB was used for locating the selenium atoms.Therefore, the 13 consistent peaks were picked out of the 20 atomsexpected to be in the asymmetric unit. The program MLPHARE™ was used forthe heavy atom refinement and the initial phase calculation, and theprogram RESOLVE™ was used for the solvent flattening. Moreover, theprogram DM™ was used for non-crystallographic averaging and furthersolvent flattening. At this time, we traced the backbone in theresulting electron density obtained at 4.0 Å resolution. The program Owas used for model building. Next, phase extension was conducted at 3.5Å. The molecular replacement analysis was conducted with the programAMORE™ using the model determined by the MAD method as a search model.Using data collected at 3.5 Å, the native crystal structure wasdetermined. The structure is shown in Table 1. Furthermore, using thedata at 3.3 Å, the native crystal structure at 3.3 Å resolution wasdetermined. The structure is shown in Table 2. The above model was builtbased on the electron density, and then refined by rigid-bodyrefinement, energy minimization, restrained β-factor refinement, andsimulated annealing procedures with the program CNS™. Moreover, theprograms SIGMAA™ and RESOLVE™ were used for further electron densitymodification to have less bias toward the model. Iterative modelbuilding and density modification enhanced the quality of the electrondensity map. After the final refinement, the Ramachandran plot analysiswith the program PROCHECK™ showed that 95.9% of the residues in thecrystal structure shown in Table 1 and 95.5% of the residues in thecrystal structure shown in Table 2 are in the most favored andadditionally allowed regions. The programs, SIGMAA™ and PROCHECK™, aresupported by CCP4.

In the crystal structure shown in Table 1, amino acid residues 1 and 2 and 513 to the C-terminus of one receptor molecule, and amino acid residues 1 to 4 and 513 to the C-terminus of the other receptor molecule in the dimer are disordered, and these regions were poorly defined in the electron density map. The electron densities for residues 1 to 4 and 50 to 53 in both ligand molecules were diffuse. The crystal structure shown in Table 1 contains 1108 amino acids and 11 carbohydrate residues. In the crystal structure shown in Table 2, amino acid residues 1, 158 to 162, 169 and 170, 179 and 180, 302 to 308, and 513 to the C-terminus of one receptor molecule and amino acid residues 1 and 2, 158 to 160, 179, 305 to 309, and 513 to the C-terminus of the other receptor molecule in the dimer are disordered. The electron densities for residues 1 to 4 and 50 to 53 in both ligand molecules are diffuse. The crystal structure in Table 2 contains 996 amino acids, 10 carbohydrate residues, and 79 water molecules.

(2) Result (Selenomethionyl dEGFR)

Both screening for a heavy atom derivative of the native crystal by the immersion method, and a series of expression, purification, and crystallization steps for dEGFR having selenomethionine instead of methionine were examined. We could not identify any potential heavy atom derivatives by the immersion method; however, we obtained a selenomethionyl crystal in the same manner as for the native crystal. The selenomethionyl crystal belonged to the space group P3₁21 with a=b=221.2 Å and c=114.2 Å and 75% solvent content. The crystal diffracted to 4.0 Å.

EXAMPLE 4 Homology Modeling Using Structure Coordinates of EGF-EGFR Complex

(1) Construction of Model Structure

The model structure of ErbB2 was built by using FAMS (Full Automatic Modeling System) (Ogata, K. et al., J. Mol. Graph. Model., Vol. 18, pp. 258-272 (2000)). The amino acid sequence information of ErbB2 (EBR2 HUMAN) registered with SWISS-PROT was used to construct the initial model. The three-dimensional structure coordinates used by FAMS for building the backbone structure were the structure coordinates obtained from our crystal structure of the EGF-EGFR complex. Furthermore, the FAMS original database was used as the database for building the side chain structures. As a result, structure coordinates of the portion from the 24th to the 541st amino acid residues of the amino acid sequence shown in SEQ ID NO: 5 were obtained.

After the construction of the model structure, it was verified that there were no theoretical problems with the constructed coordinates, using a program for evaluating protein three-dimensional structure such as Profiles-3D, which is a module of the InsightII (Accelrys Inc., San Diego, Calif.) molecular design-supporting software package. This model structure is shown in FIG. 12.

Modeling was also conducted by the same techniques for ErbB3 (SEQ ID NO: 6), ErbB4 (SEQ ID NO: 7), the insulin-like growth factor-1 receptor (SEQ ID NO: 3), and the insulin receptor (SEQ ID NO: 4), thereby obtaining model structures.

(2) Prediction of Ligand-Binding Site

The structure conserved region (SCR) of each model structure was superimposed on that of the EGFR three-dimensional structure. With reference to the alignment table used for the initial structure construction, amino acid residues in the model structure that occupy almost the same positions, in terms of three-dimensional coordinates, as the amino acids in EGFR interacting with EGF were extracted. These amino acid residues were considered to be interacting amino acid residues. The results are shown in FIG. 11.

EXAMPLE 5 Generation of Pharmacophores

An example of a pharmacophore can be specified as follows using the information on the amino acid residues extracted from the information on the complex structure. The following interactions are exemplified as interactions that are thought to be particularly useful for pharmacophore generation (Table 4).

TABLE 4

EGF amino acid residue   EGFR amino acid residue   Binding type                                               Possibility of pharmacophore
Arg45                    Gln384                    Hydrogen bond formation between main chain and side chain  High
Asp46                    Arg29                     Salt linkage                                               Particularly high
Leu47                    Leu382, Ala415, Val417    Hydrophobic interaction                                    Particularly high

The interaction contains 3 interaction sites that are closely adjacent to each other and capable of interacting with both domain I and domain III. Hence, it is likely to be specified as a pharmacophore capable of searching for not only an EGFR antagonist, but also an EGFR agonist. A specific example of such a pharmacophore is specified by the active conformation of the tripeptide Arg45, Asp46, and Leu47 upon binding with EGFR.

Next, the EGF structure was read into a commercial software package for virtual screening, such as Catalyst, and then interacting atoms or functional groups were converted into appropriate spheres representing pharmacophoric features as shown in Table 5 below using the function-mapping function of Catalyst, thereby generating a pharmacophore.

TABLE 5

Residues on EGF                        Spheres representing pharmacophoric features
Arg45: NH group of main chain          Hydrogen Bond Donor (HBD) region
Asp46: carboxyl group of side chain    Negative Ionizable (NI) region
Leu47: side chain                      Hydrophobic (HP) region

The pharmacophore is specified by the coordinates of the following 3 spheres representing pharmacophoric features.

Sphere 1 representing pharmacophoric features: the negative ionizable region having a center represented by an X-coordinate of −5.623, a Y-coordinate of 6.259, and a Z-coordinate of 0.853, and a radius of 1.5 Å.

Sphere 2 representing pharmacophoric features: the hydrophobic region having a center represented by an X-coordinate of −12.656, a Y-coordinate of 3.363, and a Z-coordinate of −2.934, and a radius of 1.5 Å.

Sphere 3 representing pharmacophoric features: the hydrogen bond donor region, which is defined by a vector having, as a start point, a hydrogen bond donor region R (root) with a center represented by an X-coordinate of −11.576, a Y-coordinate of 9.546, and a Z-coordinate of −2.135, and a radius of 1.5 Å, and, as an end point, a hydrogen bond donor region T (terminal) with a center represented by an X-coordinate of −13.86, a Y-coordinate of 8.532, and a Z-coordinate of −3.792, and a radius of 2.0 Å.

The pharmacophore is shown in FIG. 13.
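The sphere definitions above can be written down as plain data and checked against candidate feature points. The following is a minimal sketch only, not the Catalyst implementation: the class `Sphere`, the list `PHARMACOPHORE_E`, and the function `fits_pharmacophore` are hypothetical names introduced for illustration, and the test that a feature "fits" when its point lies inside the corresponding sphere is a simplifying assumption. Only the centers and radii are taken from the text.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Sphere:
    """One pharmacophoric feature sphere (coordinates and radius in angstroms)."""
    feature: str
    center: tuple
    radius: float

    def contains(self, point):
        # A point fits the feature if it lies inside or on the sphere.
        return math.dist(self.center, point) <= self.radius


# Centers and radii as listed in the text; the feature labels are illustrative.
PHARMACOPHORE_E = [
    Sphere("negative ionizable", (-5.623, 6.259, 0.853), 1.5),
    Sphere("hydrophobic", (-12.656, 3.363, -2.934), 1.5),
    Sphere("HBD root", (-11.576, 9.546, -2.135), 1.5),
    Sphere("HBD terminal", (-13.86, 8.532, -3.792), 2.0),
]


def fits_pharmacophore(feature_points, spheres=PHARMACOPHORE_E):
    """Return True if every sphere contains the candidate point for its feature.

    feature_points maps a feature label to the (x, y, z) position of the
    matching chemical feature in a candidate conformer (an assumed input).
    """
    return all(
        s.feature in feature_points and s.contains(feature_points[s.feature])
        for s in spheres
    )
```

A real search would also enumerate conformers and align each one to the pharmacophore; this sketch assumes the candidate is already in the coordinate frame of the spheres.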

EXAMPLE 6 Screening Method Using Structure Coordinates and Pharmacophore of EGF-EGFR Complex

(1) Virtual Screening Utilizing DOCK

(1-1) Specifying Search Site

The three-dimensional structure represented by the structure coordinates in Table 1, obtained from the analysis of the EGF-EGFR co-crystal structure, was visually displayed using the software InsightII (Accelrys Inc.). Connolly surface views of EGF (ligand) and EGFR (receptor) are shown in FIG. 14.

Inspection of FIG. 14 showed that, among the residues protruding convexly from EGF, the residues marked with circular symbols in the figure correspond to pockets on EGFR. These portions are thought to be promising ligand-binding site candidates.

Hence, based on the above information, three sites (site 1, site 2, and site 3) on EGFR were assumed to be ligand-binding sites, and DOCK spheres were set at the positions of major atoms of the EGF amino acid residues thought to interact with each site. The sites used for this docking study are shown in Table 6.

TABLE 6

Site   Important residues (EGF side)   Type               Major interacting residues (EGFR side)
1      Leu26                           Hydrophobic        Ala68, Leu69, Leu98
       Lys28                           Basic              Glu35
2      Leu15                           Hydrophobic        Leu325, Val350
       Cys31                           Hydrogen bonding   Gln16
       Asn32                           Hydrogen bonding   Gln16, Gly18
       Cys33                           Hydrogen bonding   Gly18
       Ile38                           Hydrophobic        Leu17, Thr10, Leu27
       Gly39                           Hydrogen bonding   Asn12
       Arg41                           Basic              Asp355
3      Tyr44                           Hydrophobic        Leu382, His346
       Arg45                           Hydrogen bonding   Gln384
       Asp46                           Acidic             Arg29
       Leu47                           Hydrophobic        Ala415, Ile438, Phe412, Val417, Leu382, Gln408

(1-2) Construction of Compound Database

As the compound database to be searched, compound information registered with ACD (Available Chemicals Directory; MDL Information Systems, Inc.) was used. First, the ACD compound information was exported in the SD file format. Then data processing steps such as salt removal, generation of three-dimensional data for compound structures by CORINA (Molecular Networks GmbH), file conversion to the “mol2” file format using accessories attached to DOCK, and charge allocation were carried out, thereby generating a compound database for DOCK (170,855 compounds). Furthermore, the data in the compound database constructed by the above techniques were divided into a compound group (acd_a: 19,270 compounds) containing acidic functional groups and a compound group (acd_b: 151,585 compounds) containing no such groups. These compound groups were analyzed separately.

(Note: DOCK tends to give unreasonably favorable scores to acidic functional groups. Thus, in this study, the above processing was carried out, and the two groups were analyzed separately.)

(1-3) Implementation of Docking Study

DOCK 4.0.1 was run for each site on the compound database generated in (1-2). Docking was performed by a flexible docking method in which ligand conformations are generated during the calculation.

The obtained results were sorted in ascending order of docking score (the docking score is the function for evaluating ligand-protein binding energy; the lower the score, the better the fit with the protein), and the top 5000 compounds were extracted. Of these compounds, compounds having a conformation adjacent to the important residues were extracted ((1) in Table 7) with a self-made script (a script for calculating the distance between an important EGFR residue and a ligand), and then compounds having a conformation in which appropriate interaction with each important residue could take place were selected by visual observation ((2) in Table 7). Furthermore, compounds having structures unfavorable as drugs and compounds (e.g., pigments) likely to show false-positive results when assayed for activity were removed, thereby obtaining the finally selected compounds ((3) in Table 7). (a) denotes acd_a, and (b) denotes acd_b.
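The ranking and distance-based extraction step (1) can be sketched as follows. This is an illustration under stated assumptions, not the self-made script itself, which the text does not reproduce: the function names, the dictionary layout of a docked pose, the 4.0 Å default cutoff, and the requirement that a pose lie near every listed residue are all hypothetical choices.

```python
import math


def min_distance(atoms_a, atoms_b):
    """Smallest distance (angstroms) between any atom of A and any atom of B."""
    return min(math.dist(a, b) for a in atoms_a for b in atoms_b)


def near_important_residues(ligand_atoms, residues, cutoff=4.0):
    """True if the ligand pose has an atom within `cutoff` of every residue.

    The cutoff value and the 'every residue' criterion are assumptions.
    """
    return all(
        min_distance(ligand_atoms, atoms) <= cutoff for atoms in residues.values()
    )


def extract_hits(poses, residues, top_n=5000, cutoff=4.0):
    """Sort poses by ascending docking score, keep the top_n, then filter by distance.

    Each pose is a dict with keys 'name', 'score', and 'atoms' (list of xyz tuples).
    """
    ranked = sorted(poses, key=lambda p: p["score"])[:top_n]
    return [p for p in ranked if near_important_residues(p["atoms"], residues, cutoff)]
```

Steps (2) and (3) in the text rely on visual inspection and medicinal chemistry judgment and are not modeled here.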

TABLE 7

Sites   Residues used for extraction   Number of selected compounds (a)   Number of selected compounds (b)
                                       (1)→(2)→(3)                        (1)→(2)→(3)
1       Leu26, Lys28                   586→93→68                          501→91→59
2       Tyr44, Arg41                   138→16→7                           267→11→4
3       Leu47 and Asp46 and Arg45      316→128→82                         234→86→38

(1-4) Cluster Analysis of Hit Compounds in Docking Study

The hit compounds selected in (1-3) contain groups of compounds that are structurally similar to each other (i.e., that share a backbone). Such compounds are highly likely to be similar to each other in their docking scores and docking modes. Therefore, if assay candidate compounds are simply selected in order of docking score, the resulting set is likely to be poor in structural variety.

When structural variety is poor, the compounds are likely to be similar to each other in safety and ADME (absorption, distribution, metabolism, and excretion) profiles. However, it is difficult to predict safety and ADME profiles in silico with current technology. When such a compound set is adopted as the assay candidate compound set, it is highly likely to be unable to cope with the risk of dropout due to an unexpected side effect, which often occurs during drug development.

Hence, conducting cluster analysis on the hit compounds of (1-3) with a focus on structural similarity, and classifying the compounds into several structurally similar groups, makes it possible to secure structural variety (see Tables 8 and 9 below). One representative compound selected from each group may be subjected to a biochemical assay.

In addition, optclus (Barnard Chemical Information Ltd.) was used for determining the optimal number of groups for each set of hits, and Daylight Clustering Package (Daylight Inc.) was used for cluster analysis.
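The idea of grouping structurally similar hits and assaying one representative per group can be illustrated with a deliberately simple stand-in. The sketch below is not the optclus or Daylight procedure: it swaps in a leader-style clustering on Tanimoto similarity between fingerprint bit sets, and the function names, the 0.7 threshold, and the representation of a fingerprint as a Python set of bit indices are all assumptions made for illustration.

```python
def tanimoto(a, b):
    """Tanimoto similarity between two fingerprints given as sets of on-bits."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def leader_clusters(fingerprints, threshold=0.7):
    """Group compounds by similarity to the first member (leader) of each cluster.

    fingerprints: dict mapping compound name -> set of bit indices, iterated in
    insertion order (for example, best docking score first).
    Returns a list of (leader_name, member_names) tuples.
    """
    clusters = []
    for name, fp in fingerprints.items():
        for leader, members in clusters:
            if tanimoto(fingerprints[leader], fp) >= threshold:
                members.append(name)
                break
        else:
            # No existing leader is similar enough, so start a new cluster.
            clusters.append((name, [name]))
    return clusters


def representatives(fingerprints, threshold=0.7):
    """One representative (the leader) per structural group."""
    return [leader for leader, _ in leader_clusters(fingerprints, threshold)]
```

If the input is ordered by docking score, each representative is the best-scoring member of its group, which matches the intent of picking one compound per group for the biochemical assay.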

TABLE 8 acd_a

Site   Number of hit compounds in docking study   Number of groups
1      68                                         25
2      7                                          6
3      82                                         55

TABLE 9 acd_b

Site   Number of hit compounds in docking study   Number of groups
1      59                                         13
2      4                                          3
3      38                                         4

(2) Virtual Screening Utilizing DOCK and AUTODOCK

(2-1) Low Molecular Weight Compound Library

A library was generated (307481 molecules in total) from the ACD low molecular weight compound database. When multiple molecules were contained in a single data entry, the entry was divided into one data set per molecule, and redundancy was then eliminated. Subsequently, the data were narrowed down based on “Lipinski's Rule of 5.” The conditions used for narrowing down the data are as follows.

1) Molecular weight: 100 or more and 500 or less
2) Calculated LogP value (o/w) (the XLOGP-1 algorithm was used): 5 or less
3) Number of acceptor atoms (the number of N and the number of O atoms contained in a compound): 10 or less
4) Number of donor atoms (the number of NH and the number of OH groups contained in a compound): 5 or less

Furthermore, molecules of the following kinds were excluded so that the docking calculation could be conducted appropriately.

1) The number of rotatable single bonds is 21 or more.
2) Radicals are contained.
3) Elements other than H, C, N, O, F, S, P, Cl, Br, and I are contained.

As a result of this narrowing-down procedure, a low molecular weight compound set consisting of 222096 molecules was obtained.
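The narrowing-down conditions listed above amount to a simple predicate over precomputed molecular descriptors. The following sketch assumes the descriptors (molecular weight, calculated LogP, acceptor and donor counts, rotatable bond count, radical flag, element set) have already been computed by some other tool; the `Molecule` record and the function names are hypothetical, and only the thresholds come from the text.

```python
from dataclasses import dataclass

# Elements permitted by exclusion condition 3).
ALLOWED_ELEMENTS = frozenset({"H", "C", "N", "O", "F", "S", "P", "Cl", "Br", "I"})


@dataclass
class Molecule:
    """Precomputed descriptors for one library entry (hypothetical record layout)."""
    name: str
    mol_weight: float
    logp: float          # calculated LogP (o/w)
    n_acceptors: int     # number of N and O atoms
    n_donors: int        # number of NH and OH groups
    n_rotatable: int     # rotatable single bonds
    has_radical: bool
    elements: frozenset  # element symbols present in the molecule


def passes_filter(m):
    """Apply the Rule-of-5 style conditions and the docking-related exclusions."""
    return (
        100 <= m.mol_weight <= 500
        and m.logp <= 5
        and m.n_acceptors <= 10
        and m.n_donors <= 5
        and m.n_rotatable <= 20          # 21 or more is excluded
        and not m.has_radical
        and m.elements <= ALLOWED_ELEMENTS
    )


def narrow_down(library):
    """Return the molecules that satisfy every condition."""
    return [m for m in library if passes_filter(m)]
```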

(2-2) Binding Site Prediction

(2-2-1) Domain Definition

Chain A of EGFR was divided into 3 domains (domain I: Glu2-Lys165, domain II: Cys166-Val312, and domain III: Cys313-Val512). Domains I and III contain an interface between EGF and EGFR, and domain II contains an interface for EGFR dimerization.

(2-2-2) Site Definition

EGF and EGFR interact with each other at three sites. An interface created by Leu14, Gln16, Gly18, Tyr45, Leu69, Glu90, and Leu98 of EGFR and Met21, Ile23, Leu26, Lys28, Cys31, Asn32, and Cys33 of EGF is defined as site 1. An interface created by Val350, Asp355, and Phe357 of EGFR and Tyr13, Leu15, and Arg41 of EGF is defined as site 2. An interface created by Leu382, Gln384, Phe412, and Ile438 of EGFR and Gln43, Arg45, and Leu47 of EGF is defined as site 3. Site 1 is contained in domain I, and site 2 and site 3 are contained in domain III. Here, a low molecular weight compound inhibiting the intermolecular interaction was searched for at each site by in silico screening.

(2-2-3) Prediction of Binding-Site

Binding sites were predicted for each of the above 3 sites using the program Sphgen included in the DOCK docking program suite.

On the molecular surface, Sphgen generates spheres whose radii correspond to the geometrical shape of the molecule. Multiple overlapping spheres are collected into one cluster, and finally multiple clusters that are independent of each other are generated. The regions occupied by these clusters are the predicted binding sites. This calculation was conducted using the predetermined default values of Sphgen.

For site 1, from among the clusters adjacent to this site, those adjacent to the hydrophobic environment composed of Leu14, Tyr45, Leu69, and Leu98 of EGFR and Met21, Ile23, and Leu26 of EGF, and those adjacent to the salt bridge linking the Glu90 residue on the EGFR side and the Lys28 residue on the EGF side were selected. Among the spheres contained in these clusters, the spheres that remained after shaping, in which those arranged at the hydrophobic pockets and their peripheral portions had been deleted, were selected as putative binding sites for site 1.

For site 2, from among the spheres contained in the clusters adjacent to site 2, spheres for which the distance from each of Val350, Asp355, and Phe357 of EGFR and Tyr13, Leu15, and Arg41 of EGF, which create the intermolecular interaction, to the center of the sphere was within 4 Å were selected.

Similarly, for site 3, from among the spheres contained in the clusters adjacent to site 3, spheres for which the distance from each of Gln384, Phe412, and Ile438 of EGFR and Gln43, Arg45, and Leu47 of EGF, which create the intermolecular interaction, to the center of the sphere was within 4.4 Å were selected.
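The distance-based selection of spheres for sites 2 and 3 can be sketched as a filter over sphere centers. This is an illustration only: the function name and input layout are hypothetical, and the reading that a sphere is kept when its center lies within the cutoff of at least one atom of the listed residues is an assumption, since the text does not spell out the exact criterion.

```python
import math


def select_spheres(sphere_centers, residue_atoms, cutoff):
    """Keep sphere centers within `cutoff` angstroms of any listed residue atom.

    sphere_centers: list of (x, y, z) tuples for Sphgen sphere centers.
    residue_atoms: dict mapping a residue label to a list of atom coordinates.
    cutoff: 4.0 for site 2 and 4.4 for site 3 according to the text.
    """
    return [
        center
        for center in sphere_centers
        if any(
            math.dist(center, atom) <= cutoff
            for atoms in residue_atoms.values()
            for atom in atoms
        )
    ]
```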

(2-3) Docking Simulation Conditions

Docking simulation was conducted (primary screening) by docking all 222096 low molecular weight compounds contained in the library generated in (2-1) to each binding site defined in (2-2) using the software DOCK 4.0. To improve calculation accuracy, the following DOCK parameters were changed from their default values.

configurations_per_cycle=30 (default 25)

maximum_orientations=1000 (default 500)

More detailed docking was conducted (secondary screening) using AutoDock 3.0.5 for the top 10347 molecules for site 1, the top 16929 molecules for site 2, and the top 10500 molecules for site 3. Here, the following parameter was also changed to improve calculation accuracy.

ga_num_evals=1500000 (default 250000)

(2-4) Docking Simulation Result

For the docking structures of the low molecular weight compounds obtained as a result of the secondary screening by AutoDock, the distance between each of Ile23, Leu26, and Lys28 of EGF for site 1, Tyr13, Leu15, and Arg41 of EGF for site 2, and Gln43, Arg45, and Leu47 of EGF for site 3 and the center of each sphere (as used in the primary screening) was calculated. Compounds for which the shortest distance was 4 Å or more were excluded, because they were predicted to bind at a position away from the target binding site. Subsequently, the low molecular weight compounds were sorted in order of AutoDock score. Thus, the top 2500 molecules for site 1, the top 808 molecules for site 2, and the top 879 molecules for site 3 were determined to be ligand candidate compounds that bind to the protein. Subsequently, these compounds were classified based on their manner of interaction with the protein. Here, the distance between the protein and each heavy atom (an atom other than hydrogen) of each compound was calculated. Among the residues in contact with each compound at a distance within 4 Å, residues electrostatically interacting with the compound were determined to be “important interacting residues,” and the remaining residues were determined to be “interacting residues.” Compounds whose “interacting residues” and “important interacting residues” showed similar appearance patterns on the primary sequence of the protein were classified into one cluster. It is expected that a compound interacting with EGFR residues that interact with EGF has higher inhibition activity. Hence, it was considered that the compounds contained in cluster 2 (319 compounds) for site 1, those contained in cluster 2 (188 compounds) for site 2, and those contained in clusters other than clusters 5, 6, and 12 (871 compounds) for site 3 had high inhibition activity.
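The classification by interaction pattern can be illustrated with a small sketch. Several simplifications are assumed and are not taken from the text: residues are treated as electrostatically interacting when they are charged residue types (Asp, Glu, Lys, Arg), and two compounds are placed in the same cluster only when their residue patterns are identical, whereas the text groups patterns that are merely similar. All names below are hypothetical.

```python
import math
from collections import defaultdict

# Assumption: charged residue types serve as a proxy for electrostatic interaction.
CHARGED = {"ASP", "GLU", "LYS", "ARG"}


def contact_pattern(ligand_heavy_atoms, protein_atoms, cutoff=4.0):
    """Return (important interacting residues, interacting residues) for one pose.

    protein_atoms: list of (residue_name, residue_number, (x, y, z)) tuples.
    A residue is in contact if any of its atoms lies within `cutoff` angstroms
    of any ligand heavy atom.
    """
    important, other = set(), set()
    for resname, resnum, xyz in protein_atoms:
        if any(math.dist(xyz, atom) <= cutoff for atom in ligand_heavy_atoms):
            if resname in CHARGED:
                important.add(resnum)
            else:
                other.add(resnum)
    return frozenset(important), frozenset(other)


def group_by_pattern(compounds, protein_atoms, cutoff=4.0):
    """Cluster compounds whose contact patterns are identical.

    compounds: dict mapping compound name -> list of heavy-atom coordinates.
    """
    clusters = defaultdict(list)
    for name, atoms in compounds.items():
        clusters[contact_pattern(atoms, protein_atoms, cutoff)].append(name)
    return dict(clusters)
```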

(2-5) Cluster Analysis of Hit Compounds in Docking Study

Cluster analysis was conducted for the hit compounds in (2-4) by the method according to (1-4). Results are shown in Table 10.

TABLE 10

Site   Number of hit compounds in docking study   Number of groups
1      319                                        33
2      188                                        33
3      871                                        33

(3) Virtual Screening Utilizing Catalyst

(3-1) Specification of Search Site (Generation of Catalyst Hypothesis)

The three-dimensional structure represented by the structure coordinates in Table 1, obtained from the analysis of the co-crystal structure, was visually displayed using the software InsightII™ (Accelrys Inc., San Diego, Calif.), and then the EGF-EGFR interaction sites were visually observed. As a result, pharmacophoric patterns thought to be important were extracted from EGF (ligand). The pharmacophoric hypotheses extracted and utilized for the search are shown below. Since a Catalyst™ hypothesis with shape taken into consideration requires at least 3 pharmacophores, each hypothesis was generated from a pharmacophoric pattern in which a pharmacophore is located at a hydrophobic pocket and which contains 2 or more interaction sites. Spheres representing pharmacophoric features were determined by reading the extracted partial structures into a commercial software package for virtual screening such as Catalyst™, and then converting the data of interacting atoms or functional groups into appropriate spheres representing pharmacophoric features using the function-mapping function of Catalyst™. Thus, pharmacophores were generated. Furthermore, the EGF partial structures utilized for extracting the pharmacophores were fitted to the corresponding pharmacophores, and then the EGF partial structures were converted to hypothetical molecular shapes using the “Convert Molecule to Shape” function of Catalyst™. By linking the shapes and the original pharmacophores, pharmacophoric hypotheses taking similarity in three-dimensional molecular shape into consideration were constructed.

In addition, molecular shapes were generated by the default method. Furthermore, when the residues of a pattern are distant from each other in terms of primary structure (e.g., Leu15 and Arg41), a bond between them was simulated at the portion where the residues are closest to each other, and then shape construction was conducted.

(a) Leu26, Asp27, and Lys28 (Site 1)

Sphere 1 representing pharmacophoric features (Leu26 side chain): the hydrophobic region having a center represented by an X-coordinate of −5.669, a Y-coordinate of −1.630, and a Z-coordinate of −1.200, and a radius of 1.5 Å.

Sphere 2 representing pharmacophoric features (carboxyl group of Asp27 side chain): the negative ionizable region having a center represented by an X-coordinate of −4.547, a Y-coordinate of 6.304, and a Z-coordinate of 4.060, and a radius of 1.5 Å.

Sphere 3 representing pharmacophoric features (amino group of Lys28 side chain): the positive ionizable region having a center represented by an X-coordinate of −0.842, a Y-coordinate of 2.938, and a Z-coordinate of 0.169, and a radius of 1.5 Å.

(b) Ile38, Gly39, Glu40, and Arg41 (Site 2)

Sphere 1 representing pharmacophoric features (guanidyl group of Arg41 side chain): the positive ionizable region having a center represented by an X-coordinate of 4.527, a Y-coordinate of 6.119, and a Z-coordinate of −0.725, and a radius of 1.5 Å.

Sphere 2 representing pharmacophoric features (carboxyl group of Glu40 side chain): the negative ionizable region having a center represented by an X-coordinate of 0.459, a Y-coordinate of −1.861, and a Z-coordinate of 1.410, and a radius of 1.5 Å.

Sphere 3 representing pharmacophoric features (NH group of Gly39 backbone): the hydrogen bond donor region, which is defined by a vector having, as a start point, a hydrogen bond donor region R (root) with a center represented by an X-coordinate of −6.534, a Y-coordinate of 3.496, and a Z-coordinate of −0.247, and a radius of 1.5 Å, and, as an end point, a hydrogen bond donor region T (terminal) with a center represented by an X-coordinate of −8.045, a Y-coordinate of 1.617, and a Z-coordinate of 1.538, and a radius of 1.7 Å.

Sphere 4 representing pharmacophoric features (Ile38 side chain): the hydrophobic region having a center represented by an X-coordinate of −11.024, a Y-coordinate of 2.667, and a Z-coordinate of 0.128, and a radius of 1.5 Å.

(c) Leu15, Arg41, and Tyr44 (Site 2)

Sphere 1 representing pharmacophoric features (Tyr44 side chain): the ring aromatic region, which is defined by a vector having, as a start point, a ring aromatic region R (root) with a center represented by an X-coordinate of −5.416, a Y-coordinate of 7.542, and a Z-coordinate of −2.184, and a radius of 1.5 Å, and, as an end point, a ring aromatic region T (terminal) with a center represented by an X-coordinate of −8.185, a Y-coordinate of 6.587, and a Z-coordinate of −2.827, and a radius of 1.5 Å.

Sphere 2 representing pharmacophoric features (guanidyl group of Arg41 side chain): the positive ionizable region having a center represented by an X-coordinate of −16.119, a Y-coordinate of 9.326, and a Z-coordinate of −0.341, and a radius of 1.5 Å.

Sphere 3 representing pharmacophoric features (Leu15 side chain): the hydrophobic region having a center represented by an X-coordinate of −14.846, a Y-coordinate of 5.806, and a Z-coordinate of −1.676, and a radius of 1.5 Å.

(d) Tyr44, Arg45, and Leu47 (Site 3)

Sphere 1 representing pharmacophoric features (Tyr44 side chain): the ring aromatic region, which is defined by a vector having, as a start point, a ring aromatic region R (root) with a center represented by an X-coordinate of 6.595, a Y-coordinate of −0.996, and a Z-coordinate of −1.663, and a radius of 1.5 Å, and, as an end point, a ring aromatic region T (terminal) with a center represented by an X-coordinate of 7.625, a Y-coordinate of 1.482, and a Z-coordinate of −0.322, and a radius of 1.5 Å.

Sphere 2 representing pharmacophoric features (NH of Arg45 backbone): the hydrogen bond donor region, which is defined by a vector having, as a start point, a hydrogen bond donor region R (root) with a center represented by an X-coordinate of 1.858, a Y-coordinate of 1.046, and a Z-coordinate of 1.370, and a radius of 1.5 Å, and, as an end point, a hydrogen bond donor region T (terminal) with a center represented by an X-coordinate of 2.051, a Y-coordinate of 2.626, and a Z-coordinate of −1.172, and a radius of 1.7 Å.

Sphere 3 representing pharmacophoric features (Leu47 side chain): the hydrophobic region having a center represented by an X-coordinate of −1.936, a Y-coordinate of 0.021, and a Z-coordinate of −3.278, and a radius of 1.5 Å.

Here, the root (R) is the start point of a vector and the terminal (T) is the end point of the vector. The ring plane in a ring aromatic region is defined as the plane perpendicular to the R-to-T vector. The arrangements of the spheres representing pharmacophoric features shown in (a) to (d) are graphically expressed in FIGS. 15 to 18.
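The relation between the R-to-T vector and the ring plane can be made concrete with a few lines of vector arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and the alignment check (the angle between a candidate ring normal and the R-to-T vector, ignoring the sign of the normal) is an assumed way of using the definition, not a description of how Catalyst evaluates the feature. The coordinates are those of the Tyr44 ring aromatic feature in hypothesis (d).

```python
import math

# Ring aromatic feature of Tyr44 in hypothesis (d), coordinates from the text.
TYR44_ROOT = (6.595, -0.996, -1.663)
TYR44_TERMINAL = (7.625, 1.482, -0.322)


def r_to_t_vector(root, terminal):
    """Vector pointing from the root (start point) to the terminal (end point)."""
    return tuple(t - r for r, t in zip(root, terminal))


def unit(v):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return tuple(x / norm for x in v)


def angle_to_ring_normal_deg(candidate_normal, root, terminal):
    """Angle in degrees between a candidate ring normal and the R-to-T vector.

    The ring plane is perpendicular to the R-to-T vector, so an angle near 0
    means the candidate ring lies in the plane defined by the feature. The
    sign of the normal is ignored because a plane has two opposite normals.
    """
    a = unit(candidate_normal)
    b = unit(r_to_t_vector(root, terminal))
    cos_angle = abs(sum(x * y for x, y in zip(a, b)))
    return math.degrees(math.acos(min(1.0, cos_angle)))
```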

(3-2) Construction of Compound Database

As the compound database to be searched, compound information registered with the ACD (Available Chemicals Directory; MDL Information Systems, Inc.) was used. First, the ACD compound information was exported in the SD file format, and then converted into an exclusive database format (catDB module, Accelrys Inc.), thereby constructing a database (231,777 compounds) for Catalyst.

(3-3) Implementation of Catalyst

The compound database constructed in (3-2) was searched, using Catalyst 4.6 (Accelrys Inc.), for compounds fitting the 4 pharmacophores ((a) to (d)) shown in (3-1) and the pharmacophore (referred to as “e”) generated in Example 5 ((1) in Table 11). As a result of this search, compounds matching all hypothetical spheres contained in each pharmacophore were selected. Furthermore, compounds having structures unfavorable as drugs and compounds that were likely to show false-positive results when assayed for activity (e.g., pigments) were eliminated, thereby obtaining the finally selected compounds ((2) in Table 11).

TABLE 11

Hypothesis   Pharmacophore                         Number of hit compounds (1)→(2)
a            Leu26, Asp27, Lys28 (site 1)          13→13
b            Ile38, Gly39, Glu40, Arg41 (site 2)   2→2
c            Leu15, Arg41, Tyr44 (site 2)          0
d            Tyr44, Arg45, Leu47 (site 3)          68→57
e            Arg45, Asp46, Leu47 (site 3)          16→12

EXAMPLE 7 EGF-EGFR Binding Inhibition Assay

(1) Production of Europium-Labeled Ligand

When the crystal structure of the EGF-EGFR complex was analyzed, it was determined that Asn, the N-terminal residue of the ligand (EGF), does not significantly participate in direct interaction with EGFR. Thus, the N-terminus was chemically modified with a europium chelate compound (DELFIA Eu-N1, PerkinElmer life sciences). Synthesis of europium-labeled EGF was consigned to PerkinElmer life sciences, and 1.5 mg of europium-labeled EGF (hereinafter denoted as Eu-EGF) dissolved in 50 mmol/L Tris-HCl buffered saline (pH 7.8) at a concentration of 48 μmol/L was obtained.

(2) EGF-Binding Experiment Using A431 Cells and Binding Inhibition by Antibody

A431 cells (human squamous cell carcinoma cells, purchased from ATCC) for use in this experiment were subcultured using a medium (hereinafter denoted as a medium) prepared by adding inactivated FBS (JRH), penicillin-streptomycin (GIBCO), and amphotericin B (Sigma) at final concentrations of 10 vol %, 50 U/mL-50 μg/mL, and 1 μg/mL, respectively, to DMEM (Sigma). A431 cells suspended in the medium were inoculated into a 96-well microtiter plate (black, COSTAR) at 10,000 cells/100 μL/well, and then pre-cultured overnight in a CO₂ incubator (37° C., 5% CO₂). After the media within the wells had been completely removed, the wells were washed once with 50 mmol/L HEPES-HCl (pH 7.8, hereinafter denoted as a Buffer) containing 138 mmol/L NaCl, 5 mmol/L KCl, 1.2 mmol/L MgSO₄, 1.2 mmol/L CaCl₂, 75 μmol/L EDTA and 0.2 vol % BSA, and then the Buffer was added at 49 μL/well. Subsequently, DMSO was added at 1 μL/well, and then Eu-EGF that had been diluted with the Buffer to various concentrations was added at 50 μL/well. The plate was incubated at room temperature for 1 hour and washed 5 times with ice-cold Buffer (200 μL/well), and then a DELFIA enhancement reagent (PerkinElmer life sciences) was added at 100 μL/well. After a further 30 minutes of incubation at room temperature, time-resolved fluorescence was measured using a plate reader (ARVO-sx, PerkinElmer life sciences). The excitation wavelength and emission wavelength used were 340 nm and 615 nm, respectively.

As a result of this experiment, fluorescence increased in an Eu-EGF concentration-dependent manner, and binding was nearly saturated at a final Eu-EGF concentration of 30 nmol/L or more (FIG. 19). Non-specific binding was expressed as the fluorescence value measured for a well to which unlabeled EGF had been added at a final concentration of 50 μmol/L before the addition of Eu-EGF. The value of specific binding was calculated by subtracting the measured value of non-specific binding from the value measured at each ligand concentration. Furthermore, when a Scatchard plot was generated based on the free ligand concentrations and the values of specific binding, Kd=2.2 nmol/L was obtained by calculation from the slope of the fitted straight line. Furthermore, when anti-EGFR monoclonal antibodies (Ab-3, Oncogene) prepared at various concentrations were added and Eu-EGF was subsequently added at a final concentration of 10 nmol/L in the above experimental system, fluorescence decreased in an antibody concentration-dependent manner, which confirmed the inhibition of EGF binding by the antibodies (FIG. 20). Based on the above results, it was concluded that screening for compounds inhibiting EGF-EGFR binding is possible using this assay system.
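The Scatchard calculation mentioned above can be sketched as an ordinary least-squares line through the points (bound, bound/free). For a single class of binding sites, bound/free = (Bmax − bound)/Kd, so the slope of the line is −1/Kd. The sketch assumes that the specific binding values are used directly as the bound quantity (for example, in fluorescence units) and that the free ligand concentration is known for each point; the function name and the input layout are hypothetical.

```python
def scatchard_kd(bound, free):
    """Estimate Kd and Bmax from a Scatchard plot by linear least squares.

    bound: specific binding values (any consistent unit).
    free:  free ligand concentrations (Kd is returned in the same unit).
    """
    xs = list(bound)
    ys = [b / f for b, f in zip(bound, free)]  # bound/free on the y-axis
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    kd = -1.0 / slope          # slope of the Scatchard line is -1/Kd
    bmax = intercept * kd      # x-intercept of the line equals Bmax
    return kd, bmax
```

With real data the points scatter around the line, and the estimate depends on how the free concentration is obtained; the text does not state this, so the free values are simply taken as input here.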

(3) Screening for Compounds Selected by Computer Screening

A representative compounds for site 1 of 2) in Example 6, arepresentative compounds for site 2 of 2) in Example 6, and arepresentative compounds for site 3 of 1) in Example 6 were subjected tothis assay. A431 cells that had been suspended in a medium wereinoculated into a 96-well microtiter plate (black, COSTAR) at 10,000cells/100 μL/well, and then pre-cultured overnight in a CO₂ incubator(37° C., 5% CO₂). Media within the wells were completely removed, andthen the wells were washed once with a Buffer. Subsequently, a Bufferwas added at 49 μL/well. Samples were prepared by weighing 3 to 8 mg oftest compounds and dissolving them in DMSO at a concentration of 10mg/mL. The sample was added at 1 μL/well, incubated at room temperaturefor 5 minutes, and then to which Eu-EGF that had been diluted with aBuffer to a final concentration of 5 nmol/L was added at 50 μL/well. Theplate was incubated at room temperature for 1 hour, washed 5 times withan ice-cold Buffer (200 μL/well), and then to which a DELFIA enhancementagent (PerkinElmer life sciences) was added at 100 μL/well. After 30minutes of incubation at room temperature, time-resolved fluorescencewas measured using a plate reader (ARVO-sx, PerkinElmer life sciences).Excitation wavelength and emission wavelength used herein were 340 nm,and 615 nm, respectively. Assay was run in duplicate, and then data wereprocessed using each average measured value. In data processing, theinhibition activity (%) was calculated by subtracting a measured valueof non-specific binding from a measured value of each group to whichDMSO and samples had been added, calculating a specific ligand bindingratio (%) in a group to which samples had been added when the specificligand binding ratio of a group to which DMSO had been added wasdetermined to be 100% and the non-specific binding ratio was determinedto be 0%, and then subtracting the thus calculated specific ligandbinding ratio from 100%. 
Furthermore, pseudo-positive samples were eliminated by the following method. Specifically, DMSO and each sample were diluted with a Buffer, and then incubated with A431 cells at room temperature for 1 hour, followed by 5 washes. Eu-EGF and a DELFIA enhancement reagent (PerkinElmer life sciences) were added, incubation was conducted at room temperature for 30 minutes, and then time-resolved fluorescence was measured using a plate reader (ARVO-sx, PerkinElmer life sciences). Compounds showing clearly decreased fluorescence compared with the measured value of the DMSO group were eliminated as pseudo-positive. The remaining compounds were determined to be active. The active compounds were added at final concentrations of 1, 3, 10, and 30 μg/mL, inhibition activities were calculated, and IC₅₀ values were then calculated by nonlinear least-squares fitting.
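
The data processing described in this section can be sketched as follows. This is a minimal illustration only, not the procedure actually used: the function names are hypothetical, the one-site model with a Hill slope fixed at 1 is an assumption (the text states only that nonlinear least-squares fitting was used), and the golden-section search over log₁₀(IC₅₀) is one simple way of minimizing the squared error.

```python
import math
from statistics import mean


def percent_inhibition(sample, dmso_control, nonspecific):
    """Inhibition (%) = 100 - specific ligand binding ratio (%).

    Specific binding is the measured value minus non-specific binding;
    the DMSO group defines 100% specific binding.
    """
    specific_control = dmso_control - nonspecific
    if specific_control <= 0:
        raise ValueError("DMSO control must exceed non-specific binding")
    specific_sample = sample - nonspecific
    return 100.0 - 100.0 * specific_sample / specific_control


def one_site_inhibition(conc, ic50):
    """Assumed model: one-site inhibition with Hill slope fixed at 1."""
    return 100.0 * conc / (conc + ic50)


def fit_ic50(concs, inhibitions, lo=0.01, hi=1000.0, iterations=200):
    """Least-squares IC50 by golden-section search on log10(IC50).

    lo and hi bound the search (same units as concs); both are
    illustrative defaults, not values from the text.
    """
    def sse(log_ic50):
        ic50 = 10.0 ** log_ic50
        return sum((one_site_inhibition(c, ic50) - y) ** 2
                   for c, y in zip(concs, inhibitions))

    a, b = math.log10(lo), math.log10(hi)
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    for _ in range(iterations):
        c = b - inv_phi * (b - a)
        d = a + inv_phi * (b - a)
        if sse(c) < sse(d):
            b = d  # minimum lies in [a, d]
        else:
            a = c  # minimum lies in [c, b]
    return 10.0 ** ((a + b) / 2.0)


if __name__ == "__main__":
    # Hypothetical duplicate fluorescence counts, averaged before processing.
    sample = mean([610.0, 590.0])
    dmso = mean([1010.0, 990.0])
    nonspecific = mean([205.0, 195.0])
    print(percent_inhibition(sample, dmso, nonspecific))

    # Synthetic dose-response data at the four concentrations in the text.
    concs = [1.0, 3.0, 10.0, 30.0]  # ug/mL
    observed = [one_site_inhibition(c, 8.0) for c in concs]
    print(fit_ic50(concs, observed))
```

With real plate data, the inhibition values passed to `fit_ic50` would be those computed by `percent_inhibition` at each concentration; a variable Hill slope would require a two-parameter fit.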

(4) Result

From the representative compounds for site 1, a compound showing IC₅₀=8.0 μg/mL: 10-[3-(4-methyl-piperazine-1-yl)-propyl]-2-trifluoromethyl-10H-phenothiazine dihydrochloride (purchased from Sigma, catalog number: T8516) was obtained. From the representative compounds for site 2, a compound showing IC₅₀=13.0 μg/mL: 8-hexylsulfanyl-3-methyl-7-propyl-3,7-dihydro-purine-2,6-dione (purchased from SALOR, catalog number: R33,683-1) was obtained. From the representative compounds for site 3, a compound showing IC₅₀=23.8 μg/mL: 2-[2-(3-ethyl-5,5,8,8-tetramethyl-5,6,7,8-tetrahydro-naphthalene-2-yl)-2-oxo-ethylsulfanyl]-nicotinic acid (purchased from MAYBRIDGE, catalog number: RH00866) was obtained.

EXAMPLE 8 Evaluation of EGFR Agonist Activity and EGFR Antagonist Activity Using EGFR Phosphorylation as an Index

When EGFR is activated by binding of a ligand, its intracellular regions are phosphorylated. This phosphorylation is detectable using a phosphorylated receptor-specific antibody. A431 cells were cultured (BNR-110M, ESPEC CORP.) in Dulbecco's Modified Eagle's Medium (Sigma) containing 10% fetal calf serum (JRH Bioscience) under 5% CO₂ at 37° C. for 1 day. The cells were then cultured in a serum-free medium for 1 day. When agonist activity was assayed, a medium containing a sample at a concentration of 10 μg/mL was added, and the resultant was allowed to stand for 10 minutes. When antagonist activity was assayed, a medium containing a sample at a concentration of 10 μg/mL or 100 μg/mL was added, and the resultant was allowed to stand for 20 minutes. Subsequently, EGF (Pepro Tech, Inc.) was added to a final concentration of 100 ng/mL, and the resultant was allowed to stand for a further 10 minutes. In addition, cells treated with EGF only and untreated cells were prepared as controls. The media were removed, the cells were washed with cooled phosphate buffered saline (PBS, Sigma), and then a heated cell lysis buffer (125 mmol/L Tris (Wako Pure Chemical Industries, Ltd.)-hydrochloric acid (Wako Pure Chemical Industries, Ltd.) buffer (pH 6.8) containing 20% glycerol (Wako Pure Chemical Industries, Ltd.), 2% SDS (Sigma), 5% 2-mercaptoethanol (Sigma), and 0.025 mg/mL bromophenol blue (Sigma)) was added, thereby obtaining a cell lysate. The cell lysate was applied to a 7.5% polyacrylamide gel (e-PAGEL, ATTO Corporation) and separated by electrophoresis (AE-6400, ATTO Corporation) at 30 mA per gel using Tris/glycine/SDS buffer (Bio-Rad Laboratories, Inc.). The separated proteins were transferred to a polyvinylidene difluoride (PVDF) membrane (clear blot membrane-P, ATTO Corporation) at 200 mA for 1 hour using a semi-dry blotting system (AE-6675, ATTO Corporation).
The PVDF membrane was blocked using PBS containing 5% skim milk (Wako Pure Chemical Industries, Ltd.) at room temperature for 1 hour. After blocking, the membrane was incubated with a primary antibody against an activated EGF receptor (anti-human activated EGFR antibody, mouse IgG1: BD Transduction Laboratories) diluted 1:1000 with PBS containing 0.1% Tween-20 (Bio-Rad Laboratories, Inc.), and then with a secondary antibody labeled with horseradish peroxidase (anti-mouse immunoglobulin, rabbit polyclonal antibody/peroxidase label: DAKO Corporation) at 37° C. for 1 hour. Subsequently, the EGF receptor activated by the action of EGF was detected by a chemiluminescence method (ECL western blotting detection system, Amersham Biosciences) using a system for photographing luminescence and fluorescence and outputting the data (AE-6962N, ATTO Corporation). Activation of the EGF receptor of A431 cells by EGF was inhibited by 10 μg/mL 10-[3-(4-methyl-piperazine-1-yl)-propyl]-2-trifluoromethyl-10H-phenothiazine dihydrochloride, 10 μg/mL 8-hexylsulfanyl-3-methyl-7-propyl-3,7-dihydro-purine-2,6-dione, and 100 μg/mL 2-[2-(3-ethyl-5,5,8,8-tetramethyl-5,6,7,8-tetrahydro-naphthalene-2-yl)-2-oxo-ethylsulfanyl]-nicotinic acid (FIG. 21).

EXAMPLE 9 Production of EGFR Variant

(1) Method

Site-directed mutagenesis was performed using the QuikChange® site-directed mutagenesis kit according to the instructions of the manufacturer (Stratagene). The mutation in the EGFR sequence was confirmed by DNA sequence analysis. CHO cells were cultured in Minimum Essential Medium α supplemented with 10% calf serum. Using the FuGENE6 method according to the manufacturer's recommendation (Roche), the cells were transfected with an expression vector carrying the wild type or variant receptor. Unless otherwise described, the transiently transfected CHO cells were cultured for 24 hours and were used for the subsequent analyses with or without a 24-hour starvation, followed by stimulation with human EGF for 5 minutes.

The expression of EGFR and the downstream ERK activation in the transiently transfected CHO cells were determined according to Kim et al. (Kim, J.-H. et al., Eur. J. Biochem. 269: 2323-2329, 2002). Moreover, immunoblot analyses were performed using an anti-EGFR polyclonal antibody (Upstate Biotechnology), an anti-phospho-p44/42 MAP kinase E10 monoclonal antibody, and an anti-MAP kinase polyclonal antibody (Cell Signaling Technology).

Next, for the wild type and the ERK activation-deficient mutants, EGFR expression on the surface of the transiently transfected CHO cells was confirmed by the two different methods described below. In the first method, the cell surface proteins were biotinylated, and then the biotinylated EGFR was detected according to the method described by Muthuswamy et al. (Muthuswamy, S. K. et al., Mol. Cell. Biol. 19: 6845-6857, 1999). In the second method, the cell surface EGFR of the transiently transfected CHO cells was labeled with the anti-EGFR 528 monoclonal antibody (Santa Cruz), and then analyzed using a FACS Vantage SE system (Becton Dickinson) according to the instructions of the manufacturer.

To detect autophosphorylation induced by EGF, the transiently transfected cells were cultured for 12 hours, and then starved in serum-free medium for 3 hours. EGFR autophosphorylation was determined by the method previously reported by Sato et al. (Sato, C. et al., J. Biochem. (Tokyo) 127: 65-72, 2000). For the EGF-binding assay, the transiently transfected cells were maintained for 24 hours in a 24-well plate coated with fibronectin, and then starved for 24 hours in serum-free medium. These cells were incubated for 1 hour with 2 nM ¹²⁵I-labeled EGF in phosphate buffered saline (PBS) containing 1 mg/ml BSA. The free ¹²⁵I-labeled EGF was removed by washing 3 times with ice-cold PBS containing 1 mg/ml BSA. Subsequently, the cells were lysed in 0.5 ml of 0.5 M NaOH, and then radioactivity was measured using a γ counter.

(2) Result

Mutagenesis was conducted for the interface residues of EGFR, and the effect on the activation of full-length EGFR was examined. First, to examine the downstream signal transduction pathway of EGFR expressed in CHO cells, EGF-dependent phosphorylation of the extracellular signal-regulated protein kinases (ERK) was assayed (FIG. 22A). In the hydrophobic receptor-dimerization interface, Tyr251 of one receptor interacts with Arg285 of the other receptor through a hydrogen bond, and interacts hydrophobically with Phe263 of the other receptor. The substitution of Arg285 with Tyr (R285Y) had a negligible effect, but the substitution of Arg285 with Ser (R285S) reduced the biological activity (FIG. 22A).

Substitution of Tyr251 or Phe263 with Ala had negligible effects. However, the combination of the R285S mutation with either Y251A or F263A resulted in almost complete loss of the activity. Both biotinylation analysis (FIG. 22B) and FACS analysis (data not shown) confirmed that a variant combining the R285S mutation with either Y251A or F263A was expressed on the cell surface at almost the same level as wild type EGFR. These nonsignaling forms were also incapable of EGFR autophosphorylation (FIG. 22C). Thus, the interface for receptor dimerization found in the crystal structure was verified by site-directed mutagenesis. The two dimerization interface mutants showed much lower affinity for EGF in the ¹²⁵I-EGF-binding assay (FIG. 22D). This suggests that high-affinity EGF binding is related to EGFR dimerization.

Furthermore, an intramolecular interaction between domain II and domain III of EGFR was also tested by site-directed mutagenesis. At the domain-domain interface, Glu293 of domain II and Arg405 of domain III form a salt bridge with each other. It was revealed that the substitution of Arg405 with Glu abolished EGF-dependent ERK phosphorylation, autophosphorylation, and high-affinity EGF binding (FIG. 22). Hence, the domain-domain interaction is important for receptor activation.
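
The variants in this example are written in the usual one-letter substitution shorthand (for instance, R285S for the replacement of Arg285 with Ser). As a minimal sketch of how such designations can be expanded when tabulating variants, the hypothetical helpers below parse single substitutions and combinations. The "/" separator for combined mutations is an assumption made for illustration and does not appear in the text.

```python
import re

# Standard one-letter to three-letter amino acid codes.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}


def parse_substitution(code):
    """Parse e.g. 'R285S' into ('Arg', 285, 'Ser')."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    wild_type, position, replacement = match.groups()
    if wild_type not in THREE_LETTER or replacement not in THREE_LETTER:
        raise ValueError(f"unknown amino acid letter in {code!r}")
    return THREE_LETTER[wild_type], int(position), THREE_LETTER[replacement]


def describe_variant(variant):
    """Expand a combined variant such as 'Y251A/R285S'.

    The '/' separator is an illustrative convention, not taken from the text.
    """
    parts = [parse_substitution(code) for code in variant.split("/")]
    return " + ".join(f"{wt}{pos} to {mut}" for wt, pos, mut in parts)


if __name__ == "__main__":
    # Variants discussed in this example.
    for name in ["R285Y", "R285S", "Y251A/R285S", "F263A/R285S", "R405E"]:
        print(name, "=", describe_variant(name))
```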

INDUSTRIAL APPLICABILITY

The present invention provides the crystal structure and the three-dimensional structure of a dimeric EGF-EGFR complex. The structure coordinates provided by the co-crystal structure of the complex of the present invention are useful in: extraction of a pharmacophore of an EGFR agonist or an EGFR antagonist; computer screening using all or some of the structure coordinates of the complex; molecular design (e.g., increased activity and provision of selectivity) of an EGFR agonist or an EGFR antagonist; design of an industrially useful EGF or EGFR variant; production of an EGF neutralization antibody or an EGF agonist antibody; a molecular replacement method utilizing the EGF-EGFR crystal structure; modeling of proteins thought to have folds similar to those of EGFR, such as an insulin receptor, and use of the structure thereof (e.g., computer screening, molecular design, antibody design, design of altered proteins, and the molecular replacement method); and the like.

LENGTHY TABLES

The patent contains a lengthy table section. A copy of the table is available in electronic form from the USPTO web site (http://seqdata.uspto.gov/?pageRequest=docDetail&DocID=US07514240B2). An electronic copy of the table will also be available from the USPTO upon request and payment of the fee set forth in 37 CFR 1.19(b)(3).

1. A crystal of a complex of epidermal growth factor (EGF) comprising: the amino acid sequence of SEQ ID NO:2 and epidermal growth factor receptor (EGFR) having amino acids 1-619 of SEQ ID NO:1 capable of diffracting X-ray at a resolution of 3.3 Å, wherein the crystal has the following characteristics (A), (B), and (C): (A) EGF binds to EGFR at a 1:1 ratio; (B) the EGF-bound EGFRs form a dimer; and (C) the crystal has the space group P3₁21, the unit cell parameters: a=220.2±1.5 Å, b=220.2±1.5 Å, and c=113.1±1.5 Å, and the bond angles: α=90°, β=90°, and γ=120°.
2. A method for producing a crystal of EGF-EGFR complex as defined in claim 1, comprising the following steps of: (A) producing crystallizable EGFR by genetic engineering techniques using Lec8 cells (ATCC CRL-1737) transformed with DNA encoding the EGFR; (B) deglycosylating the EGFR using a glycosidase, followed by purification of the resulting EGFR to obtain crystallizable EGFR; (C) bringing the crystallizable EGFR into contact with EGF; and (D) obtaining a crystal of EGF-EGFR complex from a solution containing the EGF-EGFR complex using a precipitating agent.
3. A method for determining the structure coordinates of an epidermal growth factor (EGF)-epidermal growth factor receptor (EGFR) complex (EGF-EGFR complex) comprising the following steps of: (A) producing a crystal of the EGF-EGFR complex by: 1) producing crystallizable EGFR by genetic engineering techniques using Lec8 cells (ATCC CRL-1737) transformed with DNA encoding the EGFR; 2) deglycosylating the EGFR using a glycosidase, followed by purification of the resulting EGFR to obtain crystallizable EGFR; 3) bringing the crystallizable EGFR into contact with EGF; and 4) obtaining a crystal of EGF-EGFR complex from a solution containing the EGF-EGFR complex using a precipitating agent; and wherein the crystal of the EGF-EGFR complex comprises: the EGF having the amino acid sequence of SEQ ID NO:2 and the EGFR having amino acids 1-619 of SEQ ID NO:1, the crystal being capable of diffracting X-ray at a resolution of 3.3 Å, wherein the crystal has the following characteristics a), b), and c): a) EGF binds to EGFR at a 1:1 ratio; b) the EGF-bound EGFRs form a dimer; and c) the crystal has the space group P3₁21, the unit cell parameters: a=220.2±1.5 Å, b=220.2±1.5 Å, and c=113.1±1.5 Å, and the bond angles: α=90°, β=90°, and γ=120°, (B) subjecting the crystal to X-ray diffraction, and determining the structure coordinates of the crystal by X-ray crystal structure analysis.
4. The method of claim 2, wherein in step (D) the precipitating agent is polyethylene glycol, and wherein the crystal is obtained using a vapor diffusion method under conditions of a pH of 7.0 to 9.0, a protein concentration of between 3 and 15 mg/ml inclusive, and a temperature of 20° C.
5. The method of claim 2, wherein the purification comprises a salting out step.
6. The method of claim 2, wherein the glycosidase is endoglycosidase D or endoglycosidase H, or a mixture thereof.
7. A method for determining the structure coordinates of an epidermal growth factor (EGF)-epidermal growth factor receptor (EGFR) complex (EGF-EGFR complex) comprising the following steps of: (A) producing a crystal of the EGF-EGFR complex by: 1) producing crystallizable EGFR by genetic engineering techniques using Lec8 cells (ATCC CRL-1737) transformed with DNA encoding the EGFR; 2) deglycosylating the EGFR using a glycosidase, followed by purification of the resulting EGFR to obtain crystallizable EGFR; 3) bringing the crystallizable EGFR into contact with EGF; and 4) obtaining a crystal of EGF-EGFR complex from a solution containing the EGF-EGFR complex using a precipitating agent, wherein the crystal of said EGF-EGFR complex comprises: the EGF having the amino acid sequence of SEQ ID NO:2 and the EGFR having amino acids 1-619 of SEQ ID NO:1, the crystal being capable of diffracting X-ray at a resolution of 3.3 Å, wherein said crystal has the following characteristics a), b), and c): a) EGF binds to EGFR at a 1:1 ratio; b) the EGF-bound EGFRs form a dimer; and c) the crystal has the space group P3₁21, the unit cell parameters: a=220.2±1.5 Å, b=220.2±1.5 Å, and c=113.1±1.5 Å, and the bond angles: α=90°, β=90°, and γ=120°, wherein said precipitating agent is polyethylene glycol, and wherein the crystal is obtained using a vapor diffusion method under conditions of a pH of 7.0 to 9.0, a protein concentration of between 3 and 15 mg/ml inclusive, and a temperature of 20° C.; and (B) subjecting the crystal to X-ray diffraction, and determining the structure coordinates of the crystal by X-ray crystal structure analysis.